Time-Bridge Variance Estimators



Abstract 

We present a set of log-price integrated variance estimators, equal
to the sum of open-high-low-close bridge estimators of spot variances
within n subsequent time-step intervals. The main characteristic of
some of the introduced estimators is that they take into account the
information on the occurrence times of the high and low values. The use of
the highs and lows of the bridge associated with the original process
makes the estimators significantly more efficient than the standard
realized variance estimator and its generalizations. Adding the
information on the occurrence times of the high and low values further
improves the efficiency of the estimators, well above those of the
well-known realized variance estimator and those derived from the sum
of Garman and Klass spot variance estimators. The exact analytical
results are derived for the case where the underlying log-price process
is an Ito stochastic process. Our results suggest more efficient ways
to record financial prices at intermediate frequencies.

1 Introduction



The integrated variance is a crucial risk indicator of the stochastic log- 
price process within specific time intervals. Most of the existing high-frequency 
integrated variance estimators are modifications of the well-known realized 
volatility (see, for instance, Andersen et al. (2003), Ait-Sahalia (2005), Zhang 
et al. (2005)), and are based on the knowledge of the open and close prices 
of n time-step intervals dividing the whole time interval of interest.
Another common practice to estimate the variance of a log-price process is to
use not two (open-close) log-prices within a given time-step, but four values,
the so-called open-high-low-close (OHLC) of the log-prices. Well-known
examples are the Garman and Klass (G&K) (1980) and Parkinson (Park) 
(1980) spot variance estimators. 

The main goal of this paper is to demonstrate the efficiency of bridge 
OHLC integrated variance estimators, that use the knowledge of the high and 
low values of the bridge process derived from the original log-price process, 
as well as possibly the random occurrence times of these extrema within each 
time-step interval. We compare the efficiencies of these time-OHLC bridge 
estimators with the efficiency of the standard realized variance and with the 
efficiency of the integrated variance estimators based on the G&K estimators 
of the variance within each elementary time-step interval. We show that 
some time-OHLC integrated variance estimators achieve a very significant 
improvement in efficiency compared with the realized variance and the G&K 
integrated variance estimators. Another remarkable property of the proposed 
time-OHLC bridge estimators is that they depend much less on the drift of 
the log-price process than the realized variance and G&K integrated variance 
estimators. This has the great advantage of essentially removing the biases 
that affect the standard estimators, given that the drift (expected return) 
is in general the most poorly constrained statistical variable. We compare 
the efficiencies of the introduced integrated variance estimators using the Ito 
process as our workhorse to model the stochastic behavior of log-prices. 

Present databases record either all prices associated with transactions or 
prune the data to keep the OHLC at given time steps, for instance, seconds,
minutes or days. The latter records giving the OHLC of the realized
log-prices do not allow the reconstruction of the OHLC (and even less the 
occurrence times of the high's and low's) for the associated bridge process in 
each elementary interval. Of course, one could construct the OHLC and any 
other useful information from the full time series of all transaction prices. 
But then, one could question the value of deriving new estimators based on 
a reduced information set. Therefore, the present paper can be considered 
as a normative exercise to learn about the fundamental limits of integrated
variance estimators. Our results are also useful in suggesting more efficient
ways to record financial prices at intermediate frequencies: instead of recording
the OHLC at the daily scale for instance, we propose that data centers
and vendors should store the open and close of the real log-price and the
high and low of the corresponding bridge in each day (or at any other chosen
frequency). Our calculations below show that this information, which
has the same cost and is as easy to obtain at the end of the day from the 
high frequency data, provides much more efficient estimators of the variance 
that can be stored for future use. The same conclusion holds true for other 
risk measures beyond variance such as higher order moments, but this is not 
explored in the present paper. 

The paper is organized as follows. Section 2 describes the properties of the 
well-known realized variance estimator, which we need in order to compare its 
efficiency with the efficiencies of the suggested time-OHLC bridge integrated 
variance estimators. Section 3 is devoted to the discussion of the efficiencies 
of the simple bridge integrated variance estimators, illustrating the comparative
efficiency and unbiasedness of the bridge integrated variance estimators.
This section, written in a pedagogical style, gradually introduces the reader
to the area of homogeneous most efficient variance estimators. Section 4 provides
a detailed analysis of the efficiency of the OHL and time-OHLC bridge
integrated variance estimators, which turn out to be significantly more efficient
than the realized variance and the G&K integrated variance estimators.
Section 5 describes the results of numerical simulations demonstrating the
comparative efficiency of the proposed estimators. Section 6 concludes. The
paper is completed by three appendices. Appendix A presents the essential
properties of the canonical bridge. Appendix B derives the joint probability
density function (pdf) of the high value and of its occurrence time. Appendix
C derives and gives the statistical properties of the joint distribution of the 
high and low values and of the occurrence time of the last extremum for the 
canonical bridge. 

2 Realized variance and beyond 

Henceforth, we assume that the log-price $X(t)$ of a given security follows
an Ito process

$$dX(t) = \mu(t)\,dt + \sigma(t)\,dW(t), \qquad X(0) = X_0, \tag{1}$$

where $W(t)$ is a realization of the standard Wiener process, $\mu(t)$ is the
drift process, and $\sigma^2(t)$ is the instantaneous variance of the log-price process
$X(t)$.

2.1 Definitions and basic properties of realized variance

Let us provide first some basic definitions and properties. 

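Definition 1 The integrated variance of the process $X(t)$ within the time
interval $t \in (0,T)$ is

$$D(T) := \int_0^T \sigma^2(t)\,dt. \tag{2}$$

Definition 2 The spot variance is defined within the time-step interval

$$\mathbb{S}_i := (t_{i-1}, t_i] \tag{3}$$

by

$$\hat{D}_{\rm real}\{X(t) : t \in \mathbb{S}_i\} := (X_i - X_{i-1})^2, \qquad X_i := X(t_i), \quad t_i := i\Delta, \quad \Delta = \frac{T}{n}. \tag{4}$$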

Definition 3 The well-known statistical estimator of the integrated variance
is the so-called realized variance, defined as

$$[X,X]_T := \sum_{i=1}^{n} \hat{D}_{\rm real}\{X(t) : t \in \mathbb{S}_i\}. \tag{5}$$

Remark 1 For Ito processes (1) and for $n \to \infty$, it is well known that the
realized variance converges in probability to the integrated one.
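Definitions 2 and 3 translate directly into a few lines of code. The following is a minimal sketch, assuming the log-prices are already sampled on the equally spaced grid $t_i = i\Delta$; the function name and the sample values are illustrative and do not come from the paper.

```python
def realized_variance(log_prices):
    """Realized variance (5): sum of the squared increments (4).

    log_prices holds X_0, X_1, ..., X_n sampled at t_i = i * Delta.
    """
    return sum((b - a) ** 2 for a, b in zip(log_prices, log_prices[1:]))


# Hypothetical log-prices on n = 2 time steps.
sample = [0.0, 1.0, 3.0]
rv = realized_variance(sample)  # (1 - 0)^2 + (3 - 1)^2
```

Only the open and close of each interval enter this estimator, which is the information set that the bridge estimators discussed later enlarge.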

However, for real data, the number n of available data points is always limited,
ultimately by the discreteness of the transaction flow and the associated
microstructure noise. Such structures, which are not taken into account in
the Ito log-price model, can be neglected in the use of the realized variance
estimator if the discrete time step $\Delta$ is much larger than the inverse of the
mean frequency $\nu$ of the tick-by-tick transactions, so that $n \ll \nu T$.

Assumption 1 While $\Delta \gg 1/\nu$, we assume that $\Delta$ is sufficiently small in
comparison with the time scales over which the drift process $\mu(t)$ and the
instantaneous variance $\sigma^2(t)$ vary, so that one may replace the original Ito
process (1) by Wiener processes with drift

$$dX(t) \simeq \mu_i\,dt + \sigma_i\,dW(t), \qquad X(t_{i-1}) = X_{i-1}, \quad t \in \mathbb{S}_i, \qquad \mu_i = \text{const}, \quad \sigma_i = \text{const}. \tag{6}$$

Consider the special case of the Wiener process with drift

$$X(t,\mu,\sigma) = \mu t + \sigma W(t). \tag{7}$$

Using the scale-invariance property of the Wiener process, the following identity
holds in law (represented by the symbol $\sim$)

$$\hat{D}_{\rm real}\{X(t,\mu,\sigma) : t \in \mathbb{S}_i\} \sim \sigma^2 \Delta\,[\gamma + W(1)]^2 = \sigma^2 \Delta \cdot X^2(1;\gamma), \tag{8}$$

where

$$X(t;\gamma) = \gamma t + W(t), \qquad \gamma = \frac{\mu}{\sigma}\sqrt{\Delta}, \qquad t \in (0,1), \tag{9}$$

is the canonical Wiener process with drift. Applying the identity in law (8)
to the realized variance expression (5), (6), we obtain

$$[X,X]_T \sim \Delta \sum_{i=1}^{n} \sigma_i^2\,(\gamma_i + W_i)^2, \tag{10}$$

where $\{W_i\}$ are iid Gaussian variables $\mathcal{N}(0,1)$. Accordingly, the expected
value of the realized variance is

$$\mathrm{E}\big[[X,X]_T\big] = \Delta \sum_{i=1}^{n} \sigma_i^2\,(1 + \gamma_i^2), \qquad \gamma_i = \frac{\mu_i}{\sigma_i}\sqrt{\Delta}. \tag{11}$$

This recovers the well-known fact that the realized variance is in general
biased for non-zero drift, and is unbiased only for zero drift ($\mu(t) \equiv 0$).
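The size of the drift-induced bias in (11) is easy to tabulate. The sketch below, under Assumption 1 with piecewise-constant $\mu_i$ and $\sigma_i$, evaluates the right-hand side of (11) and compares it with the target $\Delta \sum_i \sigma_i^2$; the function names and the numerical inputs are hypothetical.

```python
import math


def expected_realized_variance(mu, sigma, delta):
    """Right-hand side of (11): Delta * sum_i sigma_i^2 * (1 + gamma_i^2)."""
    total = 0.0
    for m, s in zip(mu, sigma):
        gamma = (m / s) * math.sqrt(delta)  # gamma_i as defined in (11)
        total += s * s * (1.0 + gamma * gamma)
    return delta * total


def integrated_variance(sigma, delta):
    """Target value Delta * sum_i sigma_i^2 for piecewise-constant sigma_i."""
    return delta * sum(s * s for s in sigma)


# Hypothetical per-interval drifts and volatilities, Delta = 0.5.
mu = [0.1, -0.2]
sigma = [0.2, 0.3]
delta = 0.5
bias = expected_realized_variance(mu, sigma, delta) - integrated_variance(sigma, delta)
```

The bias equals $\Delta^2 \sum_i \mu_i^2$, so it vanishes only when every $\mu_i$ is zero, in agreement with the statement following (11).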

2.2 Beyond realized variance with new estimated variance estimators $\hat{D}_{\rm est}(T)$

The essential idea of the present work is that it is possible to improve
on the realized variance estimator of the integrated variance, for a
fixed number $n \ll \nu T$ of time-steps with durations $\Delta$, by replacing it by

$$\hat{D}_{\rm est}(T) = \sum_{i=1}^{n} \hat{D}_{\rm est}\{X(t) : t \in \mathbb{S}_i\}, \tag{12}$$

where the functional $\hat{D}_{\rm est}\{X(t) : t \in \mathbb{S}_i\}$ is an improved estimator of the
spot variance given by definition 2. The subscript est is used to refer to
some particular estimator and the subscript real means that this estimator
reduces to the realized variance estimator.

Definition 4 The estimator $\hat{D}_{\rm est}(T)$ defined by (12) is said to be unbiased
if, for all intervals $i = 1, \dots, n$,

$$\mathrm{E}\big[\hat{D}_{\rm est}\{X(t) : t \in \mathbb{S}_i\}\big] = \Delta \cdot \sigma_i^2, \tag{13}$$

which implies

$$\mathrm{E}\big[\hat{D}_{\rm est}(T)\big] = \Delta \sum_{i=1}^{n} \sigma_i^2. \tag{14}$$

When there exists at least one interval j such that condition (13) does not
hold, the estimator is considered biased.

2.3 Estimator efficiency 

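Let $\hat{D}_{\rm est}(T)$ be some unbiased variance estimator. We propose to quantify
its efficiency in terms of the coefficient of variation

$$\rho\big[\hat{D}_{\rm est}(T)\big] := \frac{\sqrt{\mathrm{Var}\big[\hat{D}_{\rm est}(T)\big]}}{\mathrm{E}\big[\hat{D}_{\rm est}(T)\big]}. \tag{15}$$

As an illustration, the coefficient of variation of the realized variance for
a Wiener process with zero drift ($\mu(t) \equiv 0$) is equal to

$$\rho\big[[X,X]_T\big] = \sqrt{2}\,\frac{\sqrt{\sum_{i=1}^{n} \sigma_i^4}}{\sum_{i=1}^{n} \sigma_i^2}. \tag{16}$$

We will need the following theorem:

Theorem 2.1 The lower bound of the function

$$f(s) := \frac{\sqrt{\sum_{i=1}^{n} s_i^2}}{\sum_{i=1}^{n} s_i}, \qquad s = \{s_1, s_2, \dots, s_n\}, \qquad \forall s_i > 0, \tag{17}$$

is equal to

$$\inf_{s} f(s) = \frac{1}{\sqrt{n}}. \tag{18}$$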

And this lower bound is attained iff all $s_i$ are identical: $s_i = s > 0$.

Proof. Let $\{s_i\}$ be the values taken by some random variable $S$ with
probabilities $\Pr\{S = s_i\} = 1/n$, $i = 1, \dots, n$. The expected and mean square
values of the random variable $S$ are equal to

$$\mathrm{E}[S] = \frac{1}{n}\sum_{i=1}^{n} s_i, \qquad \mathrm{E}[S^2] = \frac{1}{n}\sum_{i=1}^{n} s_i^2. \tag{19}$$

Since, for any random variable $S$, the inequality $\sqrt{\mathrm{E}[S^2]} \geq \mathrm{E}[S]$ holds, this
implies $f(s) \geq 1/\sqrt{n}$. The inequality becomes an equality iff all $s_i = s$ for
some $s > 0$. ■

Applying this theorem with $s_i = \sigma_i^2$ to the right-hand side of expression (16)
shows that $\rho\big[[X,X]_T\big]$ satisfies the inequality

$$\rho\big[[X,X]_T\big] \geq \rho_{\rm real}(n), \qquad \rho_{\rm real}(n) = \sqrt{\frac{2}{n}}, \tag{20}$$

where the lower bound $\rho_{\rm real}(n)$ of the efficiency is attained only if all $\{\sigma_i\}$ are
identical.

Below, we will compare the efficiencies of different estimators via the
comparison of their lower bounds

$$\rho_{\rm est}(n) = \inf \rho\big[\hat{D}_{\rm est}(T)\big]. \tag{21}$$
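The bound of Theorem 2.1 and its consequence (20) can be checked numerically on any set of positive numbers. The sketch below is only illustrative: the function names are not from the paper, and the spot variances used are arbitrary.

```python
import math


def shape_factor(s):
    """f(s) of (17): sqrt(sum of s_i^2) divided by sum of s_i, for positive s_i."""
    return math.sqrt(sum(v * v for v in s)) / sum(s)


def realized_cv(sigma2):
    """Coefficient of variation (16) of the realized variance at zero drift.

    sigma2 holds the per-interval variances sigma_i^2.
    """
    return math.sqrt(2.0) * shape_factor(sigma2)


n = 8
flat = [0.04] * n                          # identical sigma_i^2: bound attained
uneven = [0.01 * (k + 1) for k in range(n)]  # unequal sigma_i^2: bound not attained

lower_bound = math.sqrt(2.0 / n)           # rho_real(n) in (20)
gap_flat = realized_cv(flat) - lower_bound
gap_uneven = realized_cv(uneven) - lower_bound
```

With identical variances the gap is zero up to rounding, while any dispersion in the $\sigma_i^2$ makes the coefficient of variation strictly larger than $\sqrt{2/n}$.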

3 Realized bridge variance estimators 

3.1 Basic definitions 

An important motivation for the introduction of a new class of so-called
"realized bridge variance estimators" is to obtain much smaller biases
compared with that of the realized variance (5) observed for nonzero drifts
$\mu(t) \not\equiv 0$.

Definition 5 The bridge $Y(t,\mathbb{S}_i)$ in discrete time steps of the original
process $X(t)$ is defined by

$$Y(t,\mathbb{S}_i) := X(t) - X_{i-1} - \frac{t - t_{i-1}}{\Delta}\,(X_i - X_{i-1}), \qquad t \in \mathbb{S}_i, \tag{22}$$

where $X_i := X(t_i)$, $t_i := i\Delta$ and $\Delta = T/n$.

As an example, let $X(t)$ be the Wiener process with drift $X(t,\mu,\sigma)$ defined
by (7). Using the translation and scale invariance properties of the Wiener
process leads to

$$Y(t,\mathbb{S}_i) \sim \sigma\sqrt{\Delta}\,\big(W(\zeta) - \zeta\,W(1)\big), \qquad \zeta = \frac{t - t_{i-1}}{\Delta} \in (0,1]. \tag{23}$$

This means that the bridge $Y(t,\mathbb{S}_i)$ (22) is identical in law to

$$Y(t,\mathbb{S}_i) \sim \sigma\sqrt{\Delta}\;Y(\zeta), \tag{24}$$

where

$$Y(t) := W(t) - t \cdot W(1), \qquad t \in (0,1], \tag{25}$$

is the canonical bridge whose basic properties are given in Appendix A.

Remark 2 The canonical bridge $Y(t)$ is completely independent of the drift
$\mu$. This property is the fundamental reason for the better performance of the
bridge variance estimators compared with the realized variance: the biases
and efficiencies of bridge variance estimators do not depend on the drift $\mu$.
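The drift independence stated in Remark 2 can be seen directly on sampled data. The sketch below builds the bridge (22) from equally spaced samples of one time-step interval and checks that adding a constant offset and a linear drift leaves the bridge unchanged. The function name, the sample path and the drift value are hypothetical, and the exact cancellation relies on the drift being constant within the interval, as in Assumption 1.

```python
def bridge(path):
    """Bridge (22) of one time-step interval.

    path holds equally spaced samples X(t_{i-1}), ..., X(t_i); sample k sits at
    the rescaled time k / m, with m the number of sub-steps.
    """
    m = len(path) - 1
    x_open, x_close = path[0], path[-1]
    return [x - x_open - (k / m) * (x_close - x_open) for k, x in enumerate(path)]


# Hypothetical intra-interval log-prices.
path = [0.0, 0.3, -0.1, 0.4, 0.2]
# Same path shifted by 5.0 and tilted by a linear drift of 0.7 per sub-step.
drifted = [x + 5.0 + 0.7 * k for k, x in enumerate(path)]

y = bridge(path)
y_drifted = bridge(drifted)
max_gap = max(abs(a - b) for a, b in zip(y, y_drifted))
```

Both bridges start and end at zero and coincide up to floating-point rounding, so any estimator built only on the bridge inherits this insensitivity to the drift.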

In the following, we explore the statistical properties of the bridge variance
estimators

$$\hat{D}_{\rm est}(T) = \sum_{i=1}^{n} \hat{D}_{\rm est}\{Y(t,\mathbb{S}_i) : t \in \mathbb{S}_i\}, \tag{26}$$

obtained from the general expression (12) by replacing the initial process
$\{X(t)\}$ by its corresponding bridge $\{Y(t,\mathbb{S}_i)\}$.

Definition 6 The estimator (26) is called homogeneous if, when applied to
the Wiener processes with drift (6), the following identity in law holds

$$\hat{D}_{\rm est}\{Y(t,\mathbb{S}_i) : t \in \mathbb{S}_i\} \sim \sigma_i^2 \Delta \cdot \hat{d}_{\rm est}, \tag{27}$$

where

$$\hat{d}_{\rm est} := \hat{D}_{\rm est}\{Y(t) : t \in (0,1]\} \tag{28}$$

is the canonical estimator of the spot variance depending on the canonical
bridge $Y(t)$ (25). Obviously, the estimator (26) is unbiased if and only if
$\mathrm{E}[\hat{d}_{\rm est}] = 1$.

Theorem 3.2 Under Assumption 1, the lower bound of the efficiency of the
unbiased homogeneous integrated bridge variance estimator (26) is

$$\rho_{\rm est}(n) = \sqrt{\frac{\mathrm{Var}[\hat{d}_{\rm est}]}{n}}, \tag{29}$$

where $\mathrm{Var}[\hat{d}_{\rm est}]$ is the variance of the canonical spot variance estimator
$\hat{d}_{\rm est}$ (28).

Proof. Under Assumption 1, the unbiased homogeneous bridge variance
estimator (26) is identical in law to

$$\hat{D}_{\rm est}(T) \sim \Delta \sum_{i=1}^{n} \sigma_i^2\,\hat{d}_{\rm est}^{\,i}, \tag{30}$$

where $\{\hat{d}_{\rm est}^{\,i}\}$ are iid random variables with mean value $\mathrm{E}[\hat{d}_{\rm est}] = 1$ and
variance $\mathrm{Var}[\hat{d}_{\rm est}]$. Accordingly, the expected value and variance of the
unbiased bridge variance estimator are equal to

$$\mathrm{E}\big[\hat{D}_{\rm est}(T)\big] = \Delta \sum_{i=1}^{n} \sigma_i^2, \qquad \mathrm{Var}\big[\hat{D}_{\rm est}(T)\big] = \Delta^2\,\mathrm{Var}[\hat{d}_{\rm est}] \sum_{i=1}^{n} \sigma_i^4. \tag{31}$$

Substituting these relations into (15), we obtain

$$\rho\big[\hat{D}_{\rm est}(T)\big] = \sqrt{\mathrm{Var}[\hat{d}_{\rm est}]}\;\frac{\sqrt{\sum_{i=1}^{n} \sigma_i^4}}{\sum_{i=1}^{n} \sigma_i^2}. \tag{32}$$

Using theorem 2.1, this yields the result (29). ■

3.2 Simplest bridge variance estimator 

Our first example of a homogeneous bridge variance estimator is

$$\hat{D}_{\rm simple}(T) = \sum_{i=1}^{n} \hat{D}_{\rm simple}\{Y(t,\mathbb{S}_i) : t \in \mathbb{S}_i\}, \tag{33}$$

where the estimator of the spot variance is given by

$$\hat{D}_{\rm simple}\{Y(t,\mathbb{S}_i) : t \in \mathbb{S}_i\} = A\,Y^2(t_i(\eta),\mathbb{S}_i), \qquad t_i(\eta) = t_{i-1} + \eta \cdot \Delta, \qquad \eta \in (0,1), \tag{34}$$

and $A$ is a normalizing factor. The estimator $\hat{D}_{\rm simple}(T)$ is homogeneous and,
if relations (6) are valid, then

$$Y^2(t_i(\eta),\mathbb{S}_i) \sim \sigma_i^2 \Delta \cdot Y_i^2(\eta), \tag{35}$$

where $\{Y_i(\eta)\}$ are iid random variables that are identical in law to the
canonical bridge (25). Substituting relation (35) into (33) leads to the identity in
law

$$\hat{D}_{\rm simple}(T) \sim A\,\Delta \sum_{i=1}^{n} \sigma_i^2\,Y_i^2(\eta). \tag{36}$$

The fact that the canonical bridge $Y(\eta)$ is Gaussian with zero mean and mean
square value $\mathrm{E}[Y^2(\eta)] = \eta(1-\eta)$ implies that the estimator (33) is unbiased
in the sense of definition 4 if $A = 1/[\eta(1-\eta)]$. Accordingly, the variance of the
estimator (33) is equal to the variance of the realized variance obtained for zero
drift ($\mu(t) \equiv 0$):

$$\mathrm{Var}\big[\hat{D}_{\rm simple}(T)\big] = 2\Delta^2 \sum_{i=1}^{n} \sigma_i^4. \tag{37}$$

This result means that the lower bound of the efficiency of the simplest bridge
estimator (33) is equal to the lower bound of the efficiency of the realized
variance estimator at zero drift:

$$\rho_{\rm simple}(n) = \rho_{\rm real}(n) = \sqrt{\frac{2}{n}}. \tag{38}$$

The shortcoming of the estimator (33) is that it is actually less efficient than
the realized variance at zero drift, in a sense discussed below.
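To make the construction (34) concrete, the sketch below evaluates the simple spot estimator from three recorded values per interval: the open, one intermediate value taken at the fraction $\eta$ of the interval, and the close. The function name and the inputs are illustrative only; the normalization $A = 1/[\eta(1-\eta)]$ is the unbiased choice derived above.

```python
def simple_bridge_spot(x_open, x_eta, x_close, eta):
    """Simple bridge spot estimator (34) with A = 1 / (eta * (1 - eta)).

    x_eta is the log-price recorded at t_{i-1} + eta * Delta, 0 < eta < 1.
    The returned value estimates sigma_i^2 * Delta.
    """
    y = x_eta - x_open - eta * (x_close - x_open)  # bridge value (22)
    return y * y / (eta * (1.0 - eta))


# A purely linear move (drift only, no noise) yields a zero estimate ...
drift_only = simple_bridge_spot(1.0, 1.5, 3.0, 0.25)
# ... whereas the squared increment (4) would attribute all of it to variance.
squared_increment = (3.0 - 1.0) ** 2
# An excursion away from the chord between open and close gives a positive value.
excursion = simple_bridge_spot(0.0, 1.0, 0.0, 0.5)
```

The first case illustrates why the bridge estimator is insensitive to the drift, while the realized variance is not.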



3.3 Comparative efficiencies of realized variance estimators

Definition 7 Let the estimator of the spot variance

$$\hat{D}_{\rm est}\{X(t) : t \in \mathbb{S}_i\} \quad \text{or} \quad \hat{D}_{\rm est}\{Y(t,\mathbb{S}_i) : t \in \mathbb{S}_i\}$$

depend on $\kappa_{\rm est}$ values of the process $X(t)$ or $Y(t,\mathbb{S}_i)$ at $\kappa_{\rm est}$ time-steps within
the time interval $t \in \mathbb{S}_i$. The corresponding estimators of the realized volatility
$\hat{D}_{\rm est}(T)$ (12) or (26) are then using a total number $n_{\rm eff} = \kappa_{\rm est} \cdot n$ of
time-steps.

Example 1 The realized variance corresponds to $\kappa_{\rm real} = 1$. Indeed, the two
values $(X_{i-1}, X_i)$ are used to estimate the spot realized variance (4), and the
first value is excluded from the semi-closed interval $\mathbb{S}_i$ (3).

Example 2 For the simplest bridge estimator (33) with (34), $\kappa_{\rm simple} = 2$.
Indeed, the estimator (34) depends on the bridge $Y(t_i(\eta),\mathbb{S}_i)$ for $t_i(\eta) \in \mathbb{S}_i$,
and $Y(t_i(\eta),\mathbb{S}_i)$ (22) is defined by the open and close values $\{X_{i-1}, X_i\}$ of
the original stochastic process $X(t)$. Excluding the open value, this yields
$\kappa_{\rm simple} = 2$.

Example 3 Consider the Garman & Klass (G&K) variance estimator based
on open, high, low and close prices, used as the spot variance estimator in
expression (12):

$$\hat{D}_{\rm GK}\{X(t) : t \in \mathbb{S}_i\} = k_1 (H_i - L_i)^2 - k_2 \big(C_i (H_i + L_i) - 2 H_i L_i\big) - k_3 C_i^2, \qquad k_1 = 0.511, \quad k_2 = 0.019, \quad k_3 = 0.383, \tag{39}$$

where $\{O_i, C_i, H_i, L_i\}$ are the open, close, high and low values

$$O_i = X_{i-1}, \qquad C_i = X_i - O_i, \qquad H_i = \sup_{t \in \mathbb{S}_i}[X(t) - O_i], \qquad L_i = \inf_{t \in \mathbb{S}_i}[X(t) - O_i].$$

Excluding the open value leads to $\kappa_{\rm GK} = 3$.
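For reference, the G&K spot estimator (39) is a one-line function of the four recorded values. The sketch below assumes, as in the original Garman and Klass construction, that the high, low and close are all measured relative to the open; the function name and the sample OHLC values are hypothetical.

```python
def garman_klass_spot(o, h, l, c):
    """G&K spot variance estimator (39) from open, high, low, close log-prices."""
    H, L, C = h - o, l - o, c - o  # values measured relative to the open
    return 0.511 * (H - L) ** 2 - 0.019 * (C * (H + L) - 2.0 * H * L) - 0.383 * C ** 2


# Hypothetical interval: opens and closes at 0, ranging between -1 and +1.
gk = garman_klass_spot(0.0, 1.0, -1.0, 0.0)
# Shifting all four log-prices by a constant leaves the estimate unchanged.
gk_shifted = garman_klass_spot(10.0, 11.0, 9.0, 10.0)
```

Unlike the bridge estimators, this function uses the extrema of the original process, so a nonzero drift shifts its expected value.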

Definition 8 We characterize the efficiencies of the novel variance estimators
by comparing them with that of the standard realized variance estimator.
The corresponding comparative efficiency $\mathcal{R}_{\rm est}$ is constructed as the ratio of
the lower bounds of the efficiencies of the realized variance and novel variance
estimator:

$$\mathcal{R}_{\rm est} = \frac{\rho_{\rm real}(\kappa_{\rm est} \cdot n)}{\rho_{\rm est}(n)}. \tag{40}$$

Putting in this expression $\rho_{\rm real}(n)$ given by equation (20) and $\rho_{\rm est}(n)$ given
by expression (29) yields

$$\mathcal{R}_{\rm est} = \sqrt{\frac{2}{\kappa_{\rm est} \cdot \mathrm{Var}[\hat{d}_{\rm est}]}}. \tag{41}$$

Remark 3 For a given duration $T$ used to define the integrated variance
(2), relation (41) takes into account that the typical waiting time between
successive data samples is given by $\Delta_{\rm eff} \simeq T/n_{\rm eff}$. Such a waiting time should
be approximately the same for the different generalized variance estimators
proposed below, leading to similar distortions to the adequacy of the Ito
process (1) in its ability to describe the real price process in the presence of
discrete tick-by-tick and other microstructure noise.

Example 4 Let us come back to the simple variance estimator based on
expression (34) for $\hat{D}_{\rm simple}\{Y(t,\mathbb{S}_i) : t \in \mathbb{S}_i\}$. The result (38) is equivalent to
$\mathrm{Var}[\hat{d}_{\rm simple}] = 2$. Substituting this value in (41) yields

$$\mathcal{R}_{\rm simple} = \sqrt{\frac{1}{\kappa_{\rm simple}}} = \frac{1}{\sqrt{2}} \simeq 0.707. \tag{42}$$

The efficiency of the simplest bridge estimator is smaller than that of the
realized variance.

Example 5 Let us evaluate the comparative efficiency of the generalized
realized variance estimator based on the spot G&K variance estimator in the
case of zero drift $\mu(t) \equiv 0$. It is known that the variance of the spot G&K
variance estimator given by (39) is equal to

$$\mathrm{Var}\big[\hat{D}_{\rm GK}\{X(t) : t \in \mathbb{S}_i\}\big] = \sigma_i^4 \Delta^2 \cdot 0.2693.$$

This gives

$$\mathrm{Var}[\hat{d}_{\rm GK}] = 0.2693. \tag{43}$$

Hence

$$\mathcal{R}_{\rm GK} = \sqrt{\frac{2}{\kappa_{\rm GK} \cdot 0.2693}} = \sqrt{\frac{2}{3 \cdot 0.2693}} \simeq 1.573. \tag{44}$$

Therefore, for zero drift, the G&K realized variance estimator is approximately
1.6 times more efficient than the realized variance estimator.
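Formula (41) reduces the comparison of estimators to two numbers per estimator: the number $\kappa_{\rm est}$ of recorded values and the variance of the canonical estimator. A small helper, sketched below with an illustrative function name, reproduces the values quoted in Examples 4 and 5.

```python
import math


def comparative_efficiency(kappa, var_d):
    """Comparative efficiency (41): sqrt(2 / (kappa_est * Var[d_est]))."""
    return math.sqrt(2.0 / (kappa * var_d))


# (kappa_est, Var[d_est]) pairs taken from Examples 1, 4 and 5.
estimators = {
    "realized": (1, 2.0),
    "simple bridge": (2, 2.0),
    "G&K (zero drift)": (3, 0.2693),
}
efficiencies = {name: comparative_efficiency(k, v) for name, (k, v) in estimators.items()}
```

By construction the realized variance has efficiency 1, and values above 1 indicate an estimator that extracts more information per recorded value.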



3.4 High bridge variance estimator 

The fact that the G&K realized variance estimator based on open-high-low-close
prices is significantly more efficient than the standard realized variance,
at least for the Ito process $X(t)$ (1) with zero drift $\mu(t) \equiv 0$, suggests
studying other estimators using different combinations of the open-high-low-close
prices. Let us start by analyzing the simplest case of what we will refer
to as the "high bridge variance estimator", defined through its spot variance
given by

$$\hat{D}_{\rm high}\{Y(t,\mathbb{S}_i) : t \in \mathbb{S}_i\} = A \cdot H_i^2, \tag{45}$$

where $A$ is a normalizing factor and

$$H_i = \sup_{t \in \mathbb{S}_i} Y(t,\mathbb{S}_i) \tag{46}$$

is the high value of the bridge $Y(t,\mathbb{S}_i)$. Note that we use here the same
notation for the high value of the bridge $Y(t,\mathbb{S}_i)$ as for that of the original
process $X(t)$, hoping that this will not give rise to any confusion.
It follows from (24) that

$$\hat{D}_{\rm high}\{Y(t,\mathbb{S}_i) : t \in \mathbb{S}_i\} \sim \sigma_i^2 \Delta \cdot \hat{d}_{\rm high}, \qquad \hat{d}_{\rm high} = A H^2, \tag{47}$$

where the high value $H$ of the canonical bridge $Y(t)$ (25) has the following
probability density function (pdf)

$$\varphi_{\rm high}(h) = 4h\,e^{-2h^2}, \qquad h > 0. \tag{48}$$

The derivation of the pdf (48) is given in Jeanblanc et al. (2009) (see also
the derivations presented in Appendix B). Accordingly, the expected value
and the variance of the square of $H$ are given by

$$\mathrm{E}[H^2] = \frac{1}{2}, \qquad \mathrm{Var}[H^2] = \frac{1}{4}. \tag{49}$$

In order for the high spot bridge variance estimator to be unbiased, we have
to choose in (45) the value $A = 2$ for the normalizing factor. This gives
$\mathrm{Var}[\hat{d}_{\rm high}] = 1$. With $\kappa_{\rm high} = 2$, we find that the comparative efficiency (41)
of the high bridge realized variance estimator is $\mathcal{R}_{\rm high} = 1$. Thus, the high
bridge realized variance estimator has the same efficiency as the standard
realized variance. But the advantage of the former is that, under Assumption
1, it is unbiased for any drift $\mu(t) \not\equiv 0$.
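The moments (49) follow from the pdf (48) by direct integration, and can be confirmed with a simple quadrature. The sketch below uses a midpoint rule on a truncated range; the truncation point, the number of steps and the function names are arbitrary choices, not part of the paper.

```python
import math


def high_pdf(h):
    """Pdf (48) of the high H of the canonical bridge."""
    return 4.0 * h * math.exp(-2.0 * h * h)


def high_moment(power, upper=6.0, steps=60000):
    """Midpoint-rule approximation of E[H^power] under the pdf (48)."""
    dh = upper / steps
    return sum(((k + 0.5) * dh) ** power * high_pdf((k + 0.5) * dh) for k in range(steps)) * dh


mean_h2 = high_moment(2)                  # E[H^2]
var_h2 = high_moment(4) - mean_h2 ** 2    # Var[H^2]
A = 1.0 / mean_h2                         # unbiasedness requires A * E[H^2] = 1
var_d_high = A * A * var_h2               # Var[d_high] for d_high = A * H^2
r_high = math.sqrt(2.0 / (2 * var_d_high))  # (41) with kappa_high = 2
```

The quadrature returns values close to $1/2$ and $1/4$, hence $A \approx 2$, $\mathrm{Var}[\hat{d}_{\rm high}] \approx 1$ and a comparative efficiency close to 1, as stated in the text.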

Remark 4 Let us give the intuition for the above result, obtained despite
the larger value of $\kappa_{\rm high} = 2$ compared to $\kappa_{\rm real} = 1$. The reason is that
the pdf of the random variable $2H^2$ is narrower than that of the random
variable $W^2$ defining the spot realized variance at zero drift. The same reason
underlies the comparative efficiency of the G&K as well as the other high and
low bridge realized variance estimators discussed below. The narrowness of
the pdfs of highs and lows compared with the pdfs of the increments of
the original stochastic process $X(t)$ results from a weak version of the Law of
Large Numbers, in the sense that the highs and lows incorporate significant
additional information about the underlying process within a given time-step,
thus leading to narrower pdfs.



3.5 Time-high bridge variance estimator 

We now introduce a novel ingredient to improve further the estimation
of the variance. In addition to using the high $H_i$ of the bridge $Y(t,\mathbb{S}_i)$,
we also assume that the time $t^i_{\rm high}$ of the occurrence of this high is recorded:

$$t^i_{\rm high} : \quad H_i = Y(t^i_{\rm high},\mathbb{S}_i). \tag{50}$$

The corresponding time-high bridge spot variance estimator is given by

$$\hat{D}_{\rm est}\{Y(t,\mathbb{S}_i) : t \in \mathbb{S}_i\} = A \cdot s\!\left(\frac{t^i_{\rm high} - t_{i-1}}{\Delta}\right) \cdot H_i^2, \tag{51}$$

where $A$ is a normalizing factor, while $s(t)$, $t \in (0,1)$, is some function that
remains to be determined so as to make the above spot variance estimator
as efficient as possible. Before providing the solution of this problem, let us
note that the following identity in law follows from (24):

$$\hat{D}_{\rm est}\{Y(t,\mathbb{S}_i) : t \in \mathbb{S}_i\} \sim \sigma_i^2 \Delta \cdot \hat{d}_{\rm est}, \tag{52}$$

where

$$\hat{d}_{\rm est} = A \cdot s(t_{\rm high}) \cdot H^2 \tag{53}$$

is the canonical time-high bridge estimator of the spot variance, $H$ is the
high value of the canonical bridge $Y(t)$ (25), and $t_{\rm high}$ is the corresponding
time-point (50).

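The expected value of the canonical estimator (53) is equal to

$$\mathrm{E}[\hat{d}_{\rm est}] = A \int_0^1 s(t)\,\alpha(t;2)\,dt, \qquad \alpha(t;\lambda) := \int_0^\infty h^{\lambda}\,\varphi_{\rm high}(h,t)\,dh, \tag{54}$$

where $\varphi_{\rm high}(h,t)$ is the joint pdf of $H$ and $t_{\rm high}$. Taking

$$A = 1 \Big/ \int_0^1 s(t)\,\alpha(t;2)\,dt, \tag{55}$$

we obtain an unbiased time-high canonical bridge estimator:

$$\hat{d}_{\rm est} = \frac{s(t_{\rm high})\,H^2}{\int_0^1 s(t)\,\alpha(t;2)\,dt}. \tag{56}$$

Its variance is

$$\mathrm{Var}[\hat{d}_{\rm est}] = \frac{\int_0^1 s^2(t)\,\alpha(t;4)\,dt}{\left(\int_0^1 s(t)\,\alpha(t;2)\,dt\right)^2} - 1. \tag{57}$$

Theorem 3.3 The function $s(t)$ that minimizes the variance (57) of the
unbiased time-high canonical bridge estimator (56) is

$$s_{\rm t\text{-}high}(t) = \frac{\alpha(t;2)}{\alpha(t;4)}. \tag{58}$$

The corresponding minimal variance is equal to

$$\mathrm{Var}[\hat{d}_{\rm t\text{-}high}] = \inf_{\forall s(t)} \mathrm{Var}[\hat{d}_{\rm est}] = \frac{1}{\xi_{\rm t\text{-}high}} - 1, \qquad \hat{d}_{\rm t\text{-}high} = \frac{s_{\rm t\text{-}high}(t_{\rm high})\,H^2}{\xi_{\rm t\text{-}high}}, \qquad \xi_{\rm t\text{-}high} = \int_0^1 \frac{\alpha^2(t;2)}{\alpha(t;4)}\,dt. \tag{59}$$

Proof. We use the Schwarz inequality

$$\left(\int_0^1 A(t)\,B(t)\,dt\right)^2 \leq \int_0^1 A^2(t)\,dt \int_0^1 B^2(t)\,dt \tag{60}$$

with

$$A(t) = s(t)\sqrt{\alpha(t;4)}, \qquad B(t) = \frac{\alpha(t;2)}{\sqrt{\alpha(t;4)}}, \tag{61}$$

to obtain

$$\left(\int_0^1 s(t)\,\alpha(t;2)\,dt\right)^2 \leq \int_0^1 s^2(t)\,\alpha(t;4)\,dt \int_0^1 \frac{\alpha^2(t;2)}{\alpha(t;4)}\,dt. \tag{62}$$

After simple transformations, we rewrite the last inequality in the form

$$\mathrm{Var}[\hat{d}_{\rm est}] = \frac{\int_0^1 s^2(t)\,\alpha(t;4)\,dt}{\left(\int_0^1 s(t)\,\alpha(t;2)\,dt\right)^2} - 1 \;\geq\; \frac{1}{\int_0^1 \frac{\alpha^2(t;2)}{\alpha(t;4)}\,dt} - 1. \tag{63}$$

The equality in (63) is reached by substituting in it $s(t) = s_{\rm t\text{-}high}(t)$ given by
expression (58). ■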
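The joint pdf of $H$ and $t_{\rm high}$ is derived in Appendix B and reads

$$\varphi_{\rm high}(h,t) = \sqrt{\frac{2}{\pi}}\,\frac{h^2}{\sqrt{t^3 (1-t)^3}}\,\exp\!\left(-\frac{h^2}{2t(1-t)}\right), \qquad h > 0. \tag{64}$$

Substituting this expression for $\varphi_{\rm high}(h,t)$ into (54) yields

$$\alpha(t;\lambda) = \frac{2}{\sqrt{\pi}}\,[2t(1-t)]^{\lambda/2}\,\Gamma\!\left(\frac{3+\lambda}{2}\right), \qquad t \in (0,1). \tag{65}$$

Therefore,

$$s_{\rm t\text{-}high}(t) = \frac{1}{5t(1-t)} \tag{66}$$

and

$$\xi_{\rm t\text{-}high} = \frac{3}{5}, \qquad \mathrm{Var}[\hat{d}_{\rm t\text{-}high}] = \frac{2}{3}, \qquad \mathcal{R}_{\rm t\text{-}high} = \sqrt{\frac{3}{2}} \simeq 1.225. \tag{67}$$

Thus, the time-high bridge realized variance estimator is less efficient than
the corresponding G&K estimator at zero drift, but is more efficient than the
realized variance.

Remark 5 The numerical result (67) takes into account that the use of $t_{\rm high}$
does not increase the number of sample values used in the spot estimator
(51). Thus, $\kappa_{\rm t\text{-}high} = \kappa_{\rm high} = 2$.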
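The chain from the moment function to the efficiency of the time-high estimator can be retraced numerically. The sketch below assumes the closed form $\alpha(t;\lambda) = \frac{2}{\sqrt{\pi}}[2t(1-t)]^{\lambda/2}\,\Gamma\!\left(\frac{3+\lambda}{2}\right)$ for the moment function, obtained by integrating $h^{\lambda}$ against the joint pdf of the high and its occurrence time, and evaluates the optimal weight, the integral $\xi_{\rm t\text{-}high}$ and the resulting variance. The function names and the quadrature settings are illustrative.

```python
import math


def alpha(t, lam):
    """Moment function alpha(t; lambda) of the joint law of (H, t_high)."""
    return 2.0 / math.sqrt(math.pi) * (2.0 * t * (1.0 - t)) ** (lam / 2.0) * math.gamma((3.0 + lam) / 2.0)


def s_opt(t):
    """Optimal weight s(t) = alpha(t; 2) / alpha(t; 4)."""
    return alpha(t, 2) / alpha(t, 4)


def xi_t_high(steps=1000):
    """Midpoint-rule value of the integral of alpha(t;2)^2 / alpha(t;4) over (0, 1)."""
    dt = 1.0 / steps
    return sum(alpha((k + 0.5) * dt, 2) ** 2 / alpha((k + 0.5) * dt, 4) for k in range(steps)) * dt


xi = xi_t_high()
var_t_high = 1.0 / xi - 1.0                 # minimal variance of the canonical estimator
r_t_high = math.sqrt(2.0 / (2 * var_t_high))  # (41) with kappa = 2
```

The integrand is constant in $t$, so the quadrature is exact up to rounding. The weight $s(t)$ grows near the ends of the interval, where a high occurring early or late is typically small and must be scaled up.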
4 Bridge time-high-low estimators

4.1 Bridge Parkinson estimator

Definition 9 The bridge realized variance estimator (26) that uses as spot
variance estimator

$$\hat{D}_{\rm bPark}\{Y(t,\mathbb{S}_i) : t \in \mathbb{S}_i\} = A \cdot (H_i - L_i)^2 \tag{68}$$

is called the bridge Parkinson estimator. In expression (68), $H_i$ and $L_i$ are
the high and low values of the bridge $Y(t,\mathbb{S}_i)$.

The bridge Parkinson estimator is identical in law to

$$\hat{D}_{\rm bPark}\{Y(t,\mathbb{S}_i) : t \in \mathbb{S}_i\} \sim \sigma_i^2 \Delta \cdot \hat{d}_{\rm bPark}, \qquad \hat{d}_{\rm bPark} = A \cdot (H - L)^2, \tag{69}$$

where $H$, $L$ are the high and low values of the canonical bridge $Y(t)$ (25).
The joint pdf of $H$ and $L$ has been derived by Saichev et al. (2009) and
reads

$$\varphi(h,\ell) = \sum_{m=-\infty}^{\infty} m\,\big[m\,\mathcal{I}(m(h-\ell)) + (1-m)\,\mathcal{I}(m(h-\ell)+\ell)\big], \qquad \mathcal{I}(h) = 4\,(4h^2 - 1)\,e^{-2h^2}. \tag{70}$$

It will be clear below that it is convenient to describe the joint statistical
properties of the high $H$ and low $L$ by using polar coordinates

$$H = R\cos\Theta, \qquad L = R\sin\Theta, \qquad R \in (0,\infty), \qquad \Theta \in \left(-\frac{\pi}{2}, 0\right). \tag{71}$$

Accordingly, we rewrite the canonical estimator (69) in the form

$$\hat{d}_{\rm bPark} = A\,R^2\,(1 - \sin 2\Theta). \tag{72}$$

Choosing the constant $A$ that makes the estimator (69) unbiased, we obtain

$$\hat{d}_{\rm bPark} = \frac{R^2\,(1 - \sin 2\Theta)}{\int_{-\pi/2}^{0} (1 - \sin 2\theta)\,\alpha(\theta;2)\,d\theta}, \qquad \alpha(\theta;\lambda) = \int_0^\infty r^{\lambda+1}\,\varphi(r\cos\theta, r\sin\theta)\,dr. \tag{73}$$
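Substituting expression (70) yields

$$\alpha(\theta;\lambda) = \sum_{m=-\infty}^{\infty} m\,\big[m\,\beta(m(\cos\theta - \sin\theta);\lambda) + (1-m)\,\beta(m(\cos\theta - \sin\theta) + \sin\theta;\lambda)\big], \qquad \beta(y;\lambda) := \int_0^\infty r^{\lambda+1}\,\mathcal{I}(ry)\,dr = \frac{(\lambda+1)\,\Gamma\!\left(1 + \frac{\lambda}{2}\right)}{2^{\lambda/2}\,|y|^{\lambda+2}}. \tag{74}$$

The variance of the canonical bridge Parkinson estimator is equal to

$$\mathrm{Var}[\hat{d}_{\rm bPark}] = \frac{\int_{-\pi/2}^{0} (1 - \sin 2\theta)^2\,\alpha(\theta;4)\,d\theta}{\left(\int_{-\pi/2}^{0} (1 - \sin 2\theta)\,\alpha(\theta;2)\,d\theta\right)^2} - 1 \simeq 0.2000. \tag{75}$$

Substituting this value into (41) and taking into account that $\kappa_{\rm bPark} = 3$ for
the bridge Parkinson estimator, we obtain the comparative efficiency

$$\mathrm{Var}[\hat{d}_{\rm bPark}] \simeq 0.2000, \quad \kappa_{\rm bPark} = 3 \quad \Longrightarrow \quad \mathcal{R}_{\rm bPark} \simeq 1.823, \tag{76}$$

which means that the bridge Parkinson estimator is significantly more efficient
than the G&K estimator at zero drift.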
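The variance of the canonical bridge Parkinson estimator can be cross-checked without the double integral over the joint pdf of the high and the low. The sketch below relies on an assumption that is not taken from the paper: the range $H - L$ of a standard Brownian bridge has the limiting law of Kuiper's statistic, whose survival function is $2\sum_{j \ge 1} (4 j^2 v^2 - 1)\,e^{-2 j^2 v^2}$. Moments of the range are then obtained by a midpoint-rule integration of the survival function. Function names, truncation points and step counts are arbitrary choices.

```python
import math


def range_survival(v, terms=50):
    """P(H - L > v) for the canonical bridge (assumed Kuiper limiting law)."""
    if v < 0.2:
        return 1.0  # the probability of a range below 0.2 is negligible
    return 2.0 * sum((4.0 * j * j * v * v - 1.0) * math.exp(-2.0 * j * j * v * v)
                     for j in range(1, terms + 1))


def range_moment(power, upper=5.0, steps=5000):
    """E[(H - L)^power] as the integral of power * v^(power-1) * P(range > v)."""
    dv = upper / steps
    total = 0.0
    for k in range(steps):
        v = (k + 0.5) * dv
        total += power * v ** (power - 1) * range_survival(v)
    return total * dv


m2 = range_moment(2)                 # E[(H - L)^2]
m4 = range_moment(4)                 # E[(H - L)^4]
A = 1.0 / m2                         # normalization making A * (H - L)^2 unbiased
var_d_bpark = m4 / (m2 * m2) - 1.0   # variance of the unbiased canonical estimator
r_bpark = math.sqrt(2.0 / (3 * var_d_bpark))  # (41) with kappa = 3
```

Under this assumption the variance comes out close to 0.2, in line with the value quoted for the bridge Parkinson estimator, and the comparative efficiency lands near 1.8.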

Remark 6 We stress that the canonical estimator $\hat{d}_{\rm bPark}$ is significantly
different from the well-known canonical Parkinson estimator (see Parkinson
(1980))

$$\hat{d}_{\rm Park} = \frac{(H - L)^2}{4 \ln 2}, \tag{77}$$

where $H$ and $L$ are the high and low values of the canonical Wiener process
with drift $X(t;\gamma)$ (9). In contrast with the bridge Parkinson estimator (73),
which is unbiased for any $\gamma$, the standard Parkinson estimator is biased at
nonzero drift. Moreover, the variance of the standard Parkinson estimator
at zero drift is

$$\mathrm{Var}[\hat{d}_{\rm Park}] \simeq 0.4073, \tag{78}$$

which is approximately twice the variance of the bridge Parkinson estimator
(76).



4.2 Non-quadratic homogeneous estimators 

Until now, we have considered homogeneous (in the sense of definition 6)
high-low estimators that are quadratic functions of the high and low values.
We now consider the more general class of homogeneous estimators, whose
spot variance estimators have the form

$$\hat{D}_{\rm est}\{Y(t,\mathbb{S}_i) : t \in \mathbb{S}_i\} = \mathcal{D}_{\rm est}(H_i, L_i), \tag{79}$$

where $\mathcal{D}_{\rm est}(h,\ell)$ is an arbitrary homogeneous function of second order.

Example 6 To illustrate the notion of non-quadratic homogeneous functions
of second order, consider the typical example

V cst (H t ,L t ) = ^££= (80) 
which satisfies the scaling property

$$\mathcal{D}_{\rm est}(\delta \cdot H_i, \delta \cdot L_i) = \delta^2 \cdot \mathcal{D}_{\rm est}(H_i, L_i), \qquad \forall \delta > 0. \tag{81}$$

The following theorem states that the spot variance estimator (79) satisfies
the relations (27), (28) of definition 6 for homogeneous estimators.

Theorem 4.4 The spot variance estimator (79) is homogeneous in the sense
of definition 6.

Proof. Let $H_i$ and $L_i$ be the high and low values of the bridge $Y(t,\mathbb{S}_i)$.
Due to relation (24) and Assumption 1, the following identity in law holds

$$\{H_i, L_i\} \sim \sigma_i\sqrt{\Delta} \cdot \{H, L\}, \tag{82}$$

where $\{H, L\}$ are the high and low values of the canonical bridge $Y(t)$ (25).
Substituting this last relation into (79) yields

$$\mathcal{D}_{\rm est}(H_i, L_i) \sim \mathcal{D}_{\rm est}\big(\sigma_i\sqrt{\Delta}\,H, \sigma_i\sqrt{\Delta}\,L\big). \tag{83}$$

Using the homogeneity of the function $\mathcal{D}_{\rm est}(h,\ell)$, we rewrite the previous
relation in the form

$$\mathcal{D}_{\rm est}(H_i, L_i) \sim \sigma_i^2 \Delta \cdot \mathcal{D}_{\rm est}(H, L), \tag{84}$$

which is analogous to expression (27), where the canonical estimator of the
spot variance is equal to

$$\hat{d}_{\rm est} = \mathcal{D}_{\rm est}(H, L). \tag{85}$$
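Homogeneity of second order is the only property of the function of the high and low that the proof uses, and it is straightforward to test numerically for a candidate function. In the sketch below, the helper name, the test point and the scale factors are arbitrary; the non-quadratic candidate is a hypothetical illustration and is not claimed to be the example given in the paper.

```python
def is_homogeneous_degree2(fn, h, l, scales=(0.5, 2.0, 3.0), tol=1e-9):
    """Check fn(d*h, d*l) == d^2 * fn(h, l) for a few scale factors d > 0."""
    base = fn(h, l)
    return all(
        abs(fn(d * h, d * l) - d * d * base) <= tol * max(1.0, abs(d * d * base))
        for d in scales
    )


def parkinson_like(h, l):
    """Quadratic candidate: squared range, as in the bridge Parkinson estimator."""
    return (h - l) ** 2


def non_quadratic(h, l):
    """Hypothetical non-quadratic candidate that is still homogeneous of order 2."""
    return (h ** 4 + l ** 4) / (h ** 2 + l ** 2)


def shifted(h, l):
    """Counter-example: the additive constant breaks the scaling property."""
    return (h - l) ** 2 + 1.0


checks = [is_homogeneous_degree2(f, 0.7, -0.4) for f in (parkinson_like, non_quadratic, shifted)]
```

Any candidate passing this check can serve as a spot estimator of the homogeneous class, up to the normalization that makes it unbiased.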



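Using the polar coordinates (71), the canonical estimator $\hat{d}_{\rm est}$ reads

$$\hat{d}_{\rm est} = \mathcal{D}_{\rm est}(R\cos\Theta, R\sin\Theta). \tag{86}$$

Using the homogeneity of the function $\mathcal{D}_{\rm est}$, we obtain

$$\hat{d}_{\rm est} = R^2 \cdot s(\Theta), \qquad s(\theta) = \mathcal{D}_{\rm est}(\cos\theta, \sin\theta). \tag{87}$$

Its expected value is equal to

$$\mathrm{E}[\hat{d}_{\rm est}] = \int_{-\pi/2}^{0} s(\theta)\,\alpha(\theta;2)\,d\theta, \tag{88}$$

where the function $\alpha(\theta;\lambda)$ is given by the equality (73). Thus, the unbiased
homogeneous non-quadratic canonical estimator reads

$$\hat{d}_{\rm est} = \frac{R^2\,s(\Theta)}{\int_{-\pi/2}^{0} s(\theta)\,\alpha(\theta;2)\,d\theta}, \qquad \mathrm{E}[\hat{d}_{\rm est}] = 1. \tag{89}$$

Accordingly, the variance of the unbiased estimator is equal to

$$\mathrm{Var}[\hat{d}_{\rm est}] = \frac{\int_{-\pi/2}^{0} s^2(\theta)\,\alpha(\theta;4)\,d\theta}{\left(\int_{-\pi/2}^{0} s(\theta)\,\alpha(\theta;2)\,d\theta\right)^2} - 1. \tag{90}$$

One can easily prove the result analogous to theorem 3.3 that the minimum
value of the variance (90) of the canonical estimator (89) with respect
to all possible functions $s(\theta)$ is given by

$$\mathrm{Var}[\hat{d}_{\rm me}] = \inf_{\forall s(\theta)} \mathrm{Var}[\hat{d}_{\rm est}] = \frac{1}{\xi_{\rm me}} - 1, \qquad \xi_{\rm me} = \int_{-\pi/2}^{0} \frac{\alpha^2(\theta;2)}{\alpha(\theta;4)}\,d\theta, \tag{91}$$

where $\hat{d}_{\rm est}$ is an arbitrary homogeneous canonical estimator of the form (89),
while $\hat{d}_{\rm me}$ is the corresponding most efficient estimator given by

$$\hat{d}_{\rm me} = \frac{R^2}{\xi_{\rm me}}\,\frac{\alpha(\Theta;2)}{\alpha(\Theta;4)}. \tag{92}$$

Calculating the numerical value of the integral in expression (91) yields

$$\mathrm{Var}[\hat{d}_{\rm me}] \simeq 0.1974, \quad \kappa_{\rm me} = 3 \quad \Longrightarrow \quad \mathcal{R}_{\rm me} \simeq 1.838, \tag{93}$$

which shows a high efficiency compared with the standard realized variance.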



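4.3 Time-high-low homogeneous estimator

Let us consider the unbiased homogeneous time-high-low canonical estimator

$$\hat{d}_{\rm est} = \frac{R^2\,s(\Theta, t_{\rm last})}{\int_0^1 dt \int_{-\pi/2}^{0} d\theta\; s(\theta,t)\,\alpha_{\rm last}(\theta,t;2)}, \tag{94}$$

where $s(\theta,t)$ is an arbitrary function, $t_{\rm last} = \sup\{t_{\rm high}, t_{\rm low}\}$ is the larger of the
two times at which the high and low values of the canonical bridge occur, and
$\alpha_{\rm last}(\theta,t;\lambda)$ is given by (C.17) in Appendix C.3.

It is easy to prove the result analogous to theorem 3.3 that the most
efficient estimator of the form (94) is

$$\hat{d}_{\rm t\text{-}me} = \frac{R^2}{\xi_{\rm t\text{-}me}}\,\frac{\alpha_{\rm last}(\Theta, t_{\rm last};2)}{\alpha_{\rm last}(\Theta, t_{\rm last};4)}, \qquad \xi_{\rm t\text{-}me} = \int_0^1 dt \int_{-\pi/2}^{0} d\theta\; \frac{\alpha_{\rm last}^2(\theta,t;2)}{\alpha_{\rm last}(\theta,t;4)}, \tag{95}$$

and the variance of this estimator is equal to

$$\mathrm{Var}[\hat{d}_{\rm t\text{-}me}] = \frac{1}{\xi_{\rm t\text{-}me}} - 1. \tag{96}$$

The numerical calculation of $\xi_{\rm t\text{-}me}$ gives

$$\mathrm{Var}[\hat{d}_{\rm t\text{-}me}] \simeq 0.1873, \quad \kappa_{\rm t\text{-}me} = 3 \quad \Longrightarrow \quad \mathcal{R}_{\rm t\text{-}me} \simeq 1.887. \tag{97}$$

The estimator of the realized variance based on the canonical estimator (95)
is significantly more efficient than that based on the G&K estimator at zero
drift.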

4.4 High-low-close bridge estimator 

Until now, we have not used explicitly the information contained in the
close values $X_i$ (1) of the time-step intervals $\mathbb{S}_i$ (3). The close values $X_i$
have been used only for the construction of the bridge $Y(t, \mathbb{S}_i)$ (22). It
seems plausible that taking into account explicitly the close values $X_i$ in
the construction of spot variance estimators may produce bridge realized
variance estimators $\hat D_{\text{est}}(T) = \sum_{i=1}^{n} \hat D_{\text{est}}\{Y(t, \mathbb{S}_i) : t \in \mathbb{S}_i; X_i\}$ that are even
more efficient than those considered until now. We show that this is indeed
the case by studying the example associated with the spot variance estimator
given by

$$\hat D_{\text{est}}\{Y(t, \mathbb{S}_i) : t \in \mathbb{S}_i; X_i\} = \hat D_{\text{est}}(H_i, L_i, X_i), \tag{98}$$

where $\hat D_{\text{est}}(h,\ell,x)$ is an arbitrary homogeneous function satisfying relation
(81). Due to its homogeneity, the following identity in law holds true

$$\hat D_{\text{est}}(H_i, L_i, X_i) \sim \sigma_i^2 \Delta \cdot \hat d_{\text{est}}, \qquad \hat d_{\text{est}} = \hat D_{\text{est}}(H, L, X), \tag{99}$$

where $H$ and $L$ are the high and low values of the canonical bridge (25), while
$X = \gamma + W$ is the close value of the underlying canonical Wiener process
with drift (9). It is known (see, for instance, Jeanblanc et al. (2009)) that
the canonical bridge $Y(t)$ and $W$ are statistically independent. Thus, the
joint pdf $\varphi(h, \ell, x; \gamma)$ of the three random variables $\{H, L, X\}$ is equal to

$$\varphi(h, \ell, x; \gamma) = \frac{1}{\sqrt{2\pi}} \exp\left(-\frac{(x-\gamma)^2}{2}\right) \varphi(h, \ell), \tag{100}$$

where the joint pdf $\varphi(h, \ell)$ of high and low values is given by expression (70).

Analogously to (50), it is convenient to represent the canonical estimator
$\hat d_{\text{est}}$ (99) in the spherical coordinate system

$$H = R\cos\Upsilon\cos\Theta, \quad L = R\cos\Upsilon\sin\Theta, \quad X = R\sin\Upsilon, \qquad \Upsilon \in (-\pi/2, \pi/2), \quad \Theta \in (-\pi/2, 0). \tag{101}$$

The canonical estimator $\hat d_{\text{est}}$ (99) then takes the form

$$\hat d_{\text{est}} = R^2 s(\Theta, \Upsilon), \tag{102}$$

where

$$s(\theta, \upsilon) = \hat D_{\text{est}}(\cos\upsilon\cos\theta, \cos\upsilon\sin\theta, \sin\upsilon). \tag{103}$$

Analogously to (51) and (55), the unbiased most efficient high-low-close
canonical estimator is given by

$$\hat d_{\text{me-x}} = \frac{1}{\xi_{\text{me-x}}(\gamma)}\, R^2 s_{\text{me-x}}(\Theta, \Upsilon; \gamma), \qquad s_{\text{me-x}}(\theta, \upsilon; \gamma) = \frac{\alpha(\theta, \upsilon; 2; \gamma)}{\alpha(\theta, \upsilon; 4; \gamma)}. \tag{104}$$

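The function $\alpha(\theta, \upsilon; \lambda; \gamma)$ is defined by the equality

$$\alpha(\theta, \upsilon; \lambda; \gamma) = \int_0^\infty r^{\lambda+2}\, \varphi(r\cos\upsilon\cos\theta, r\cos\upsilon\sin\theta, r\sin\upsilon; \gamma)\, dr. \tag{105}$$

The variance of the most efficient canonical estimator $\hat d_{\text{me-x}}$ is equal to

$$\operatorname{Var}[\hat d_{\text{me-x}}] = \frac{1}{\xi_{\text{me-x}}(\gamma)} - 1, \tag{106}$$

with

$$\xi_{\text{me-x}}(\gamma) = \int_{-\pi/2}^{0} d\theta \int_{-\pi/2}^{\pi/2} d\upsilon\, \cos\upsilon\, \frac{\alpha^2(\theta, \upsilon; 2; \gamma)}{\alpha(\theta, \upsilon; 4; \gamma)}. \tag{107}$$

The calculation of the integral (107) for $\gamma = 0$ gives

$$\operatorname{Var}[\hat d_{\text{me-x}}] \simeq 0.1794, \qquad \kappa_{\text{me-x}} = 3 \quad\Longrightarrow\quad \mathcal{R}_{\text{me-x}} \simeq 1.928. \tag{108}$$

This estimator is more efficient than the most efficient time-high-low
canonical estimator, as can be seen by comparing (108) with (97).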

4.5 Time-high-low-close bridge estimator 

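The last example we present here is the realized variance estimator that
uses in each interval $\mathbb{S}_i$ the high and low values $H_i$, $L_i$ of the bridge $Y(t, \mathbb{S}_i)$
(22), the close value $X_i$ of the original stochastic process $X(t)$ and the time
instant $t_{\text{last}} = \sup\{t_L, t_H\}$, defined as the larger of the two times at which
the high and low values of the canonical bridge occur.

One can rigorously prove that, analogously to (104), the homogeneous
time-OHLC bridge canonical estimator that is most efficient for some given
value of $\gamma$ is equal to

$$\hat d_{\text{t-me-x}}(\Theta, \Upsilon, t_{\text{last}}; \gamma) = R^2 s_{\text{t-me-x}}(\Theta, \Upsilon, t_{\text{last}}; \gamma), \qquad s_{\text{t-me-x}}(\theta, \upsilon, t; \gamma) = \frac{1}{\xi_{\text{t-me-x}}(\gamma)}\, \frac{\alpha(\theta, \upsilon, t; 2; \gamma)}{\alpha(\theta, \upsilon, t; 4; \gamma)}, \tag{109}$$

where

$$\xi_{\text{t-me-x}}(\gamma) = \int_0^1 dt \int_{-\pi/2}^{0} d\theta \int_{-\pi/2}^{\pi/2} d\upsilon\, \cos\upsilon\, \frac{\alpha^2(\theta, \upsilon, t; 2; \gamma)}{\alpha(\theta, \upsilon, t; 4; \gamma)} \tag{110}$$

and

$$\alpha(\theta, \upsilon, t; \lambda; \gamma) = \int_0^\infty r^{\lambda+2}\, \varphi_{\text{last}}(r\cos\upsilon\cos\theta, r\cos\upsilon\sin\theta, r\sin\upsilon, t; \gamma)\, dr. \tag{111}$$

The joint pdf $\varphi_{\text{last}}(h, \ell, x, t; \gamma)$ is

$$\varphi_{\text{last}}(h, \ell, x, t; \gamma) = \frac{1}{\sqrt{2\pi}} \exp\left(-\frac{(x-\gamma)^2}{2}\right) \varphi_{\text{last}}(h, \ell, t), \tag{112}$$

where $\varphi_{\text{last}}(h, \ell, t)$ is given by expression (C.15) in Appendix C.2.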

Remark 7 Recall that the parameter $\gamma$ (11) is unknown, because both
the drift and the instantaneous variance of the log-price process are generally
unknown. Therefore, our strategy below is to choose, for definiteness, $\gamma = 0$
and then explore the dependence on $\gamma$ of the bias and efficiency of the
different "zero drift" estimators. Accordingly, we will use below the following
shorthand notations, omitting the argument $\gamma$, such as

$$\hat d_{\text{t-me-x}}(\Theta, \Upsilon, t) := \hat d_{\text{t-me-x}}(\Theta, \Upsilon, t; \gamma = 0).$$

The calculation of the integral (110), where $\alpha(\theta, \upsilon, t; \lambda)$ is given by
expression (C.18) in Appendix C.4, yields for $\gamma = 0$

$$\operatorname{Var}[\hat d_{\text{t-me-x}}] = \frac{1}{\xi_{\text{t-me-x}}} - 1 \simeq 0.1710, \qquad \kappa_{\text{t-me-x}} = 3 \quad\Longrightarrow\quad \mathcal{R}_{\text{t-me-x}} \simeq 1.975. \tag{113}$$

This estimator is more efficient than all the previous ones discussed until now.
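
The four pairs of variance and efficiency values quoted in (93), (97), (108) and (113) can be cross-checked against each other. The short sketch below assumes the relation $\mathcal{R} = \sqrt{2/(\kappa \operatorname{Var})}$; this relation is only inferred here from the quoted numbers (the definition of $\mathcal{R}$ is given earlier in the paper, not in this section), so it should be read as a consistency check under that assumption, and the helper name `efficiency` is illustrative.

```python
import math

def efficiency(variance, kappa):
    # Assumed relation, inferred from the quoted values only.
    return math.sqrt(2.0 / (kappa * variance))

# (variance, quoted efficiency), all with kappa = 3
quoted = {
    "high-low (93)": (0.1974, 1.838),
    "time-high-low (97)": (0.1873, 1.887),
    "high-low-close (108)": (0.1794, 1.928),
    "time-high-low-close (113)": (0.1710, 1.975),
}

computed = {name: efficiency(var, 3) for name, (var, _) in quoted.items()}
for name, (var, r_quoted) in quoted.items():
    print(f"{name}: Var = {var:.4f}, quoted R = {r_quoted}, computed R = {computed[name]:.4f}")
```

Under this assumed relation, each additional piece of information (occurrence time, close value) lowers the variance and raises the efficiency monotonically.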



5 Numerical simulations and comments 

5.1 Description of numerical simulations 

The goal of this section is to check by numerical simulations some
analytical results obtained above. Realizations of the canonical Wiener process
$X(t; \gamma)$ (9) with drift for time $t \in [0, 1]$ are obtained numerically as
cumulative sums of a number $I = 10^5$ of Gaussian summands, corresponding to
a discrete time step $\Delta = 10^{-5}$. For each numerical realization, we calculate
the values of the open-close spot variance canonical estimator, equal in this
case to

$$\hat d_{\text{real}} = (\gamma + W)^2, \tag{114}$$
and the values of the G&K canonical estimator

$$\hat d_{\text{GK}} = k_1 (H - L)^2 - k_2 \big(W (H + L) - 2HL\big) - k_3 (\gamma + W)^2, \tag{115}$$

where $H$ and $L$ are the high and low values of the simulated process $X(t; \gamma)$.
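
The simulation just described can be sketched as follows for the open-close and G&K estimators. This is a minimal illustration, not the code used for the figures: it uses far fewer time steps and samples than the values quoted above, it assumes the standard Garman and Klass (1980) constants $k_1 = 0.511$, $k_2 = 0.019$, $k_3 = 0.383$ (not restated in this section), and it uses the close value $\gamma + W$ in every term of the G&K form. The function name `simulate_estimators` is illustrative.

```python
import numpy as np

# Standard Garman-Klass constants (assumed here).
K1, K2, K3 = 0.511, 0.019, 0.383

def simulate_estimators(gamma=0.0, n_paths=1000, n_steps=1000, seed=0):
    """Simulate X(t; gamma) on [0, 1] and return (d_real, d_gk) per path."""
    rng = np.random.default_rng(seed)
    dt = 1.0 / n_steps
    increments = gamma * dt + np.sqrt(dt) * rng.standard_normal((n_paths, n_steps))
    paths = np.cumsum(increments, axis=1)
    close = paths[:, -1]                         # gamma + W
    high = np.maximum(paths.max(axis=1), 0.0)    # the open value X(0) = 0 counts
    low = np.minimum(paths.min(axis=1), 0.0)
    d_real = close**2
    d_gk = (K1 * (high - low) ** 2
            - K2 * (close * (high + low) - 2.0 * high * low)
            - K3 * close**2)
    return d_real, d_gk

d_real, d_gk = simulate_estimators()
print("open-close: mean", d_real.mean(), "var", d_real.var())
print("G&K:        mean", d_gk.mean(), "var", d_gk.var())
```

With a coarse grid the simulated high and low are slightly underestimated, so the G&K average falls a little below 1; refining the time step reduces this discretization bias.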

We also constructed numerical realizations of the bridge process $Y(t)$ (25)
and calculated the corresponding values of the canonical estimator $\hat d_{\text{t-me-x}}$
(109). This estimator depends on the function $\alpha(\theta, \upsilon, t; \lambda)$ defined by
expression (C.18) in Appendix C.4, which is explicitly obtained by summing
a double-infinite series (C.19). In practice, we estimate this double sum by
keeping only the first 101 terms in each dimension, corresponding to
evaluating $101 \times 101 \approx 10^4$ summands in (C.18).

Remark 8 At first glance, it would seem that the calculation of the G&K
estimator (115), which needs only a few simple arithmetic operations, is
much easier than the evaluation of the large number of summands in the
series (C.18) that define the estimator $\hat d_{\text{t-me-x}}$ (109). In our computerized
world, it turns out that there is actually no significant difference from the
computational point of view.



5.2 Statistics of the estimators in the case of zero drift ($\gamma = 0$)

Figure 1 shows 5000 realizations of the open-close estimator $\hat d_{\text{real}}$ (114), of
the G&K estimator (115) and of the estimator $\hat d_{\text{t-me-x}}$ in the case of the Wiener
process with zero drift ($\gamma = 0$). It is clear that the last estimator is the
most efficient in comparison with the open-close and the G&K estimators.
The expected values and variances of these three estimators obtained by
statistical averaging over $10^4$ samples are

$$E[\hat d_{\text{real}}] \simeq 1.0110, \qquad E[\hat d_{\text{GK}}] \simeq 1.0058, \qquad E[\hat d_{\text{t-me-x}}] \simeq 1.0001,$$
$$\operatorname{Var}[\hat d_{\text{real}}] \simeq 1.9947, \qquad \operatorname{Var}[\hat d_{\text{GK}}] \simeq 0.2669, \qquad \operatorname{Var}[\hat d_{\text{t-me-x}}] \simeq 0.1696.$$

These values are consistent with the theoretical analytical predictions
obtained in previous sections:

$$E[\hat d_{\text{real}}] = E[\hat d_{\text{GK}}] = E[\hat d_{\text{t-me-x}}] = 1,$$
$$\operatorname{Var}[\hat d_{\text{real}}] = 2, \qquad \operatorname{Var}[\hat d_{\text{GK}}] \simeq 0.2693, \qquad \operatorname{Var}[\hat d_{\text{t-me-x}}] \simeq 0.1710.$$

In order to have truly comparable efficiencies of these realized variance
estimators, bearing in mind that their effective sample sizes are different
($\kappa_{\text{real}} = 1$, $\kappa_{\text{GK}} = \kappa_{\text{t-me-x}} = 3$), we performed moving averages with $r = 30$
subsequent samples for the open-close estimator (114) and with $r = 10$
subsequent samples for the G&K estimator (115) and the estimator $\hat d_{\text{t-me-x}}$ (109).
Figure 2 presents these moving averages, which mimic the normalized
estimators of the integrated variance in the case where all instantaneous
variances are the same ($\sigma_i^2 = \sigma^2 = \text{const}$). It is clear that the open-close
estimator of the realized variance remains significantly less efficient than the G&K
estimator, and much less efficient than the most efficient estimator $\hat d_{\text{t-me-x}}$.



5.3 $\gamma$-dependence of biases and efficiencies of canonical
estimators

In the previous subsection, we presented detailed calculations of the
comparative efficiency of unbiased variance estimators for the particular case of
Wiener processes with zero drift. In real financial markets, the drift process
$\mu(t)$ is unknown and there is no reason for it to vanish. Thus, it is important
to explore quantitatively the dependence on the parameter $\gamma$ (11) of the biases
and efficiencies of the spot variance canonical estimators described above.

We begin with the open-close spot variance canonical estimator $\hat d_{\text{real}}$ (114).
It is easy to show that its expected value and variance are quadratic functions
of $\gamma$:

$$E[\hat d_{\text{real}}] = 1 + \gamma^2, \qquad \operatorname{Var}[\hat d_{\text{real}}] = 2 + 4\gamma^2. \tag{116}$$
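
The two moments in (116) follow from the Gaussian moments of $W$, and they are easy to confirm by direct sampling. The following sketch draws standard normal values for $W$ and compares the sample mean and variance of $(\gamma + W)^2$ with $1 + \gamma^2$ and $2 + 4\gamma^2$; the sample size and the helper name `open_close_moments` are arbitrary choices made for this illustration.

```python
import numpy as np

def open_close_moments(gamma, w):
    """Sample mean and variance of (gamma + W)^2 for given draws w of W."""
    d_real = (gamma + w) ** 2
    return d_real.mean(), d_real.var()

rng = np.random.default_rng(1)
w = rng.standard_normal(200_000)

results = {}
for gamma in (0.0, 0.5, 1.0):
    mean, var = open_close_moments(gamma, w)
    results[gamma] = (mean, var)
    print(f"gamma={gamma}: mean={mean:.3f} (theory {1 + gamma**2:.3f}), "
          f"var={var:.3f} (theory {2 + 4 * gamma**2:.3f})")
```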



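The spot variance homogeneous time-open-high-low canonical bridge
estimators, such as the Park estimator $\hat d_{\text{bPark}}$ and the time-high-low estimator
$\hat d_{\text{t-me}}$ (95), are unbiased for all $\gamma$:

$$E[\hat d_{\text{bPark}}] = E[\hat d_{\text{t-me}}] = 1.$$

Their variances do not depend on $\gamma$ at all:

$$\operatorname{Var}[\hat d_{\text{bPark}}] \simeq 0.2000, \qquad \operatorname{Var}[\hat d_{\text{t-me}}] \simeq 0.1873 \qquad \forall \gamma. \tag{117}$$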



To obtain the $\gamma$-dependence of the biases and variances of the G&K
canonical estimator $\hat d_{\text{GK}}$ (115) and of the canonical estimator $\hat d_{\text{t-me-x}}$ (109), we
generate $10^4$ numerical realizations of the canonical Wiener process $X(t, \gamma)$
with drift, for $\gamma = 0; 0.1; \ldots; 1.5; 1.6$. Then, we calculate the statistical
averages and variances of the corresponding $10^4$ realizations of the canonical
estimators $\hat d_{\text{GK}}$ and $\hat d_{\text{t-me-x}}$, which are shown in figure 3. The continuous lines
are respectively the expected value (116) of the open-close estimator $\hat d_{\text{real}}$
given by expression (114) and the fitted curves

$$E[\hat d_{\text{est}}] = a_{\text{est}} \gamma^2 + b_{\text{est}}$$

for the averaged values of the canonical estimators $\hat d_{\text{GK}}$ and $\hat d_{\text{t-me-x}}$. Their
fitted parameters are

$$a_{\text{GK}} \simeq 0.126, \qquad a_{\text{t-me-x}} \simeq 0.082, \qquad b_{\text{GK}} \simeq b_{\text{t-me-x}} \simeq 1.$$
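
A fit of the form $a\gamma^2 + b$ is a linear least-squares problem in the variable $\gamma^2$. The sketch below illustrates the procedure on the open-close estimator, for which (116) gives the exact answer $a = b = 1$; applying it to the G&K or time-OHLC estimators only requires replacing the list of sample means. The helper name `fit_quadratic_in_gamma` and the use of `numpy.polyfit` are choices made for this illustration, not a description of the fitting code behind figure 3.

```python
import numpy as np

def fit_quadratic_in_gamma(gammas, values):
    """Least-squares fit values ~ a * gamma**2 + b; returns (a, b)."""
    a, b = np.polyfit(np.asarray(gammas) ** 2, np.asarray(values), 1)
    return a, b

rng = np.random.default_rng(2)
gammas = np.arange(0.0, 1.65, 0.1)        # 0, 0.1, ..., 1.6
w = rng.standard_normal(10_000)            # 10^4 realizations of W

# Sample means of the open-close estimator (114) for each gamma.
means = [np.mean((g + w) ** 2) for g in gammas]
a_real, b_real = fit_quadratic_in_gamma(gammas, means)
print(a_real, b_real)
```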

Figure 4 shows the statistical average of the variances of the canonical
estimators $\hat d_{\text{GK}}$ and $\hat d_{\text{t-me-x}}$. The two horizontal lines indicate the variance
values (117). The continuous lines show the fitted curves

$$\operatorname{Var}[\hat d_{\text{est}}] = c_{\text{est}} \gamma^2 + d_{\text{est}}$$

of the variances of the canonical estimators $\hat d_{\text{GK}}$ and $\hat d_{\text{t-me-x}}$. Their parameters
are

$$c_{\text{GK}} \simeq 0.089, \qquad c_{\text{t-me-x}} \simeq 0.0272, \qquad d_{\text{GK}} \simeq 0.271, \qquad d_{\text{t-me-x}} \simeq 0.170.$$



5.4 Construction of general variance estimators 

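We have introduced the canonical estimator $\hat d_{\text{t-me-x}}$ given by expression
(109) that includes the information on the value of the time $t_{\text{last}} = \sup\{t_L, t_H\}$,
defined as the larger of the two times at which the high and low values
of the canonical bridge occur. It seems that the canonical estimator

$$\hat d_{\text{tt-me-x}}(\Theta, \Upsilon, t_{\text{high}}, t_{\text{low}}; \gamma) = R^2 s_{\text{tt-me-x}}(\Theta, \Upsilon, t_{\text{high}}, t_{\text{low}}; \gamma), \qquad s_{\text{tt-me-x}}(\theta, \upsilon, t_1, t_2; \gamma) = \frac{1}{\xi_{\text{tt-me-x}}(\gamma)}\, \frac{\alpha(\theta, \upsilon, t_1, t_2; 2; \gamma)}{\alpha(\theta, \upsilon, t_1, t_2; 4; \gamma)}, \tag{118}$$

taking into account both the high and low values and their corresponding occurrence
times ($t_{\text{high}} : H = Y(t_{\text{high}})$, $t_{\text{low}} : L = Y(t_{\text{low}})$), is even more efficient than
the estimator (109). In expression (118), we have used the notation

$$\alpha(\theta, \upsilon, t_1, t_2; \lambda; \gamma) = \int_0^\infty r^{\lambda+2}\, \varphi(r\cos\upsilon\cos\theta, r\cos\upsilon\sin\theta, r\sin\upsilon, t_1, t_2; \gamma)\, dr, \tag{119}$$

where $\varphi(h, \ell, x, t_1, t_2; \gamma)$ is the joint pdf of the high, low, close, $t_{\text{high}}$ and $t_{\text{low}}$ random
variables.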

We have not explored the statistical properties of the estimator (118)
because we have not yet made the effort of deriving the exact analytical
expression of $\varphi(h, \ell, x, t_1, t_2; \gamma)$. We can however construct the function $\alpha$
(119) using statistical averaging:

$$\alpha(\theta, \upsilon, t_1, t_2; \lambda, \gamma)\, \cos\upsilon\, d\upsilon\, d\theta\, dt_1\, dt_2 \simeq \frac{1}{N} \sum_{k=1}^{N} R_k^{\lambda}\, I(\Upsilon_k, \Theta_k, t_{\text{high},k}, t_{\text{low},k}). \tag{120}$$

In this expression, the values $\{R_k, \Upsilon_k, \Theta_k, t_{\text{high},k}, t_{\text{low},k}\}$ are parameters of the
numerically simulated $k$-th sample of the canonical Wiener process with drift
$X(t, \gamma)$ (9), $N$ is the number of samples, and $I$ is the indicator of the set

$$(\upsilon, \upsilon + d\upsilon) \times (\theta, \theta + d\theta) \times (t_1, t_1 + dt_1) \times (t_2, t_2 + dt_2).$$

We would like to point out that it is possible to construct the function $\alpha$
by an analogous statistical treatment for more general log-price processes that
extend the Wiener process with drift to include more adequately the
microstructure noise, the presence of heavy tails of returns and other stylized facts
that can be found for various financial assets. In other words, relations such
as (120) offer the possibility of constructing novel most efficient variance
estimators of the form (118), extending the standard approach of
econometricians looking for new constructions of efficient volatility estimators. The
only requirement is to be able to simulate numerically the underlying stochastic
process that represents the dynamics of a given financial asset. Then, the use of
statistical averaging, similar to (120), will enable the construction of
high-frequency realized estimators that use the most efficient estimators described
above as elementary "bricks".
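
The statistical averaging idea can be illustrated on the simplest case treated in this paper, the high-low estimator of Section 4.2, where only the angle $\theta$ has to be binned. The sketch below simulates the canonical bridge on a coarse grid, estimates the binned values of $\alpha(\theta;2)\,d\theta$ and $\alpha(\theta;4)\,d\theta$ as weighted histograms, and then evaluates $\xi_{\text{me}}$ of (91). It is a simplified two-dimensional analogue of (120) under assumed settings (number of paths, time steps and bins are arbitrary, and the function names are illustrative), so the resulting variance is only expected to lie in the neighbourhood of the value quoted in (93).

```python
import numpy as np

def simulate_bridge_high_low(n_paths=4000, n_steps=400, seed=3):
    """High and low of a discretized canonical bridge Y(t) = W(t) - t W(1)."""
    rng = np.random.default_rng(seed)
    dt = 1.0 / n_steps
    w = np.cumsum(np.sqrt(dt) * rng.standard_normal((n_paths, n_steps)), axis=1)
    t = np.arange(1, n_steps + 1) * dt
    y = w - t * w[:, -1:]
    high = np.maximum(y.max(axis=1), 0.0)   # Y(0) = 0 counts
    low = np.minimum(y.min(axis=1), 0.0)
    return high, low

def xi_from_samples(high, low, n_bins=20):
    """Binned estimate of xi = integral of alpha(theta;2)^2 / alpha(theta;4)."""
    r = np.hypot(high, low)
    theta = np.arctan2(low, high)           # lies in [-pi/2, 0]
    edges = np.linspace(-np.pi / 2, 0.0, n_bins + 1)
    n = len(r)
    a2, _ = np.histogram(theta, bins=edges, weights=r**2)  # ~ N * alpha(.;2) dtheta
    a4, _ = np.histogram(theta, bins=edges, weights=r**4)  # ~ N * alpha(.;4) dtheta
    a2, a4 = a2 / n, a4 / n
    ok = a4 > 0                              # skip empty bins
    return np.sum(a2[ok] ** 2 / a4[ok])

high, low = simulate_bridge_high_low()
xi = xi_from_samples(high, low)
var_me = 1.0 / xi - 1.0
print("xi =", xi, " Var =", var_me)
```

Because the binned ratio is invariant under a rescaling of $R$, the estimate is not very sensitive to the overall discretization bias of the simulated high and low values; the number of bins trades resolution in $\theta$ against sampling noise per bin.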

6 Conclusion 

We have introduced a variety of integrated variance estimators, based on
the open-high-low values of the bridges $Y(t, \mathbb{S}_i)$ (22), and close values $X_i$ (1)
of the underlying log-price process $X(t)$. The main peculiarity of some of
the introduced estimators is to take into account not only the high and low
values but additionally their occurrence times. This last piece of information
leads to estimators that are even more efficient. We discussed quantitatively
the statistical properties of the estimators for the class of Ito models for the
log-price stochastic process.

Our work opens the road to the construction of novel types of integrated 
variance estimators of log-price processes of real financial markets that take 
into account the microstructure noise, heavy power tails of returns, and 
chaotic jumps.

A Basic properties of the canonical bridge

A.1 Symmetry properties

The canonical bridge $Y(t)$ (25) exhibits the following time reversibility
and reflection properties

$$Y(t) \sim Y(1-t), \qquad Y(t) \sim -Y(t). \tag{A.1}$$

Some statistical consequences of these symmetry properties are as follows.
Let

$$H = \sup_{t \in (0,1)} Y(t), \qquad L = \inf_{t \in (0,1)} Y(t), \tag{A.2}$$

be the high and low values of the canonical bridge, while $t_{\text{high}}$ and $t_{\text{low}}$ are
their corresponding occurrence times:

$$t_{\text{high}} : H = Y(t_{\text{high}}), \qquad t_{\text{low}} : L = Y(t_{\text{low}}). \tag{A.3}$$

Consider the cumulative distribution function (cdf)

$$\Phi_{\text{high}}(t) = \Pr\{t_{\text{high}} < t\}$$

of the occurrence time $t_{\text{high}}$ of the high value of the canonical bridge. Due to
the reversibility property (A.1), one has

$$\Pr\{t_{\text{high}} < t\} = \Pr\{t_{\text{high}} > 1 - t\} \quad\Longrightarrow\quad \Phi_{\text{high}}(t) + \Phi_{\text{high}}(1 - t) = 1. \tag{A.4}$$

Accordingly, the pdf of $t_{\text{high}}$

$$\varphi_{\text{high}}(t) := \frac{d\Phi_{\text{high}}(t)}{dt}$$

presents the symmetry

$$\varphi_{\text{high}}(t) = \varphi_{\text{high}}(1 - t). \tag{A.5}$$

Due to the reflection property (A.1) of the canonical bridge, the cdf $\Phi_{\text{low}}(t)$
of $t_{\text{low}}$ (A.3) coincides with the cdf of $t_{\text{high}}$:

$$\Phi_{\text{low}}(t) = \Phi_{\text{high}}(t) \quad\Longrightarrow\quad \varphi_{\text{low}}(t) = \varphi_{\text{high}}(t) = \varphi_{\text{high}}(1 - t).$$

A.2 Interplay between bridge and Wiener processes

We will need below the well-known identity in law for the canonical bridge

$$Y(t) \sim y(t) := (1 - t)\, W\!\left(\frac{t}{1 - t}\right).$$

Using the change of time variable

$$\tau = \frac{t}{1 - t} \quad\Longleftrightarrow\quad t = \frac{\tau}{1 + \tau}$$

and the scaling properties of the Wiener process, we can replace the
compounded process

$$y(t(\tau)) = y\!\left(\frac{\tau}{1 + \tau}\right)$$

by the more convenient process, which is identical in law and reads

$$y(t(\tau)) \sim Z(\tau) = \frac{1}{1 + \tau}\, W(\tau). \tag{A.6}$$

In turn, the following identity in law holds

$$Y(t) \sim Z(\tau(t)) = Z\!\left(\frac{t}{1 - t}\right). \tag{A.7}$$

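B Joint pdf of the high value and its occurrence time

B.1 Reflection method

Let us consider the function $f(\omega; \tau, h)$ such that

$$\Pr\{W(\tau) \in (\omega, \omega + d\omega) \cap W(\tau') < h(1 + \tau') : \tau' \in (0, \tau)\} = f(\omega; \tau, h)\, d\omega. \tag{B.1}$$

This function $f(\omega; \tau, h)$ satisfies the following diffusion equation

$$\frac{\partial f}{\partial \tau} = \frac{1}{2} \frac{\partial^2 f}{\partial \omega^2} \tag{B.2}$$

with initial and absorbing boundary conditions

$$f(\omega; \tau = 0, h) = \delta(\omega), \qquad f(\omega = h + h\tau; \tau, h) = 0. \tag{B.3}$$

We solve the initial-boundary problem (B.2) with (B.3) using the reflection
method, which amounts to searching for a solution of the form

$$f(\omega; \tau, h) = \frac{1}{\sqrt{2\pi\tau}} \left[\exp\left(-\frac{\omega^2}{2\tau}\right) - A \exp\left(-\frac{(\omega - 2h)^2}{2\tau}\right)\right],$$

where the factor $A$ is defined from the absorbing boundary condition (B.3),
i.e.

$$\exp\left(-\frac{h^2(1 + \tau)^2}{2\tau}\right) = A \exp\left(-\frac{h^2(1 - \tau)^2}{2\tau}\right) \quad\Longrightarrow\quad A = e^{-2h^2}.$$

We thus obtain

$$f(\omega; \tau, h) = \frac{1}{\sqrt{2\pi\tau}} \left[\exp\left(-\frac{\omega^2}{2\tau}\right) - \exp\left(-2h^2 - \frac{(\omega - 2h)^2}{2\tau}\right)\right]. \tag{B.4}$$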



B.2 Pdf of the maximal value of the canonical bridge 

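In view of (B.1) and (A.6), the joint pdf of $W(\tau)$ and of the high value

$$\mathcal{H}(\tau) = \sup_{\tau' \in (0, \tau)} Z(\tau') \tag{B.5}$$

of the stochastic process $Z(\tau')$ within the interval $\tau' \in (0, \tau)$ is equal to

$$Q(\omega, h; \tau) = \frac{\partial f(\omega; \tau, h)}{\partial h}.$$

Substituting in the above equation the expression (B.4) yields

$$Q(\omega, h; \tau) = \frac{1}{\tau} \sqrt{\frac{2}{\pi\tau}}\, \big(2h(1 + \tau) - \omega\big) \exp\left(-2h^2 - \frac{(\omega - 2h)^2}{2\tau}\right), \qquad \omega < h(1 + \tau), \quad h > 0. \tag{B.6}$$

In particular, the pdf of the high value $\mathcal{H}(\tau)$ (B.5)

$$Q(h; \tau) = \int_{-\infty}^{h(1 + \tau)} Q(\omega, h; \tau)\, d\omega$$

is equal to

$$Q(h; \tau) = \sqrt{\frac{2}{\pi\tau}} \exp\left(-\frac{h^2(1 + \tau)^2}{2\tau}\right) + 2h e^{-2h^2} \operatorname{erfc}\left(\frac{h(1 - \tau)}{\sqrt{2\tau}}\right).$$

Using the identity in law (A.7), the pdf $Q_{\text{high}}(h; t)$ of the high value

$$H(t) = \sup_{t' \in (0, t)} Y(t')$$

is equal to

$$Q_{\text{high}}(h; t) = Q\!\left(h; \frac{t}{1 - t}\right). \tag{B.7}$$

In particular, the pdfs of the high values $\mathcal{H}$ (B.5) and $H$ (A.2) are the same
and equal to

$$\varphi_{\text{high}}(h) = \lim_{\tau \to \infty} Q(h; \tau) = 4h e^{-2h^2}. \tag{B.8}$$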

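B.3 Pdf of the high value of the bridge and of its occurrence time

In order to derive the joint pdf of the maximal value $H$ (A.2) and of the
occurrence time $t_{\text{high}}$ (A.3), we first consider the related joint pdf of the high
value $\mathcal{H}$ (B.5) of the auxiliary process $Z(\tau)$ (A.6) and of its occurrence time

$$\tau_{\text{high}} : \mathcal{H} = Z(\tau_{\text{high}}).$$

The function $F(h, \tau)$ that defines the probability

$$F(h, \tau)\, dh = \Pr\{\mathcal{H} \in (h, h + dh), \tau_{\text{high}} < \tau\}$$

is given by

$$F(h, \tau) = \int_{-\infty}^{h(1 + \tau)} Q(\omega, h; \tau)\, P(\omega, h, \tau)\, d\omega, \tag{B.9}$$

where $Q(\omega, h; \tau)$ is the joint pdf of $W(\tau)$ and $\mathcal{H}(\tau)$, given by equality (B.6),
while

$$P(\omega, h, \tau) = \lim_{\theta \to \infty} P(\omega, h, \tau, \theta), \qquad P(\omega, h, \tau, \theta) = \Pr\{W(\tau' | \tau, \omega) < h(1 + \tau') : \tau' \in (\tau, \tau + \theta)\}. \tag{B.10}$$

Here, $W(\tau' | \tau, \omega)$ is the conditioned Wiener process that takes the value $\omega$ at
$\tau' = \tau$. Due to the identity in law (A.6), $P(\omega, h, \tau)$ is equal to the probability
that the following inequality holds

$$Z(\tau' | \tau, \omega) < h, \qquad \tau' \in (\tau, \infty),$$

where $Z(\tau' | \tau, \omega)$ is the conditioned stochastic process $Z(\tau')$, which is equal
to $\omega / (1 + \tau)$ at $\tau' = \tau$.

The probability $P(\omega, h, \tau, \theta)$ (B.10) is given by

$$P(\omega, h, \tau, \theta) = \int_{-\infty}^{h(1 + \tau + \theta)} f(x; \omega, h, \tau, \theta)\, dx, \tag{B.11}$$

where the pdf $f(x; \omega, h, \tau, \theta)$ satisfies the initial-boundary value problem

$$\frac{\partial f}{\partial \theta} = \frac{1}{2} \frac{\partial^2 f}{\partial x^2}, \qquad f(x; \omega, h, \tau, \theta = 0) = \delta(x - \omega), \qquad f(x = h(1 + \tau + \theta); \omega, h, \tau, \theta) = 0.$$

Its solution, obtained by the reflection method, is

$$f(x; \omega, h, \tau, \theta) = \frac{1}{\sqrt{2\pi\theta}} \left[\exp\left(-\frac{(x - \omega)^2}{2\theta}\right) - \exp\left(-2h\big(h(1 + \tau) - \omega\big) - \frac{(x + \omega - 2h(1 + \tau))^2}{2\theta}\right)\right].$$

Substituting this last expression into (B.11) yields

$$P(\omega, h, \tau, \theta) = \frac{1}{2} \left[\operatorname{erfc}\left(\frac{\omega - h(1 + \tau + \theta)}{\sqrt{2\theta}}\right) - e^{-2h(h(1 + \tau) - \omega)} \operatorname{erfc}\left(\frac{h(1 + \tau - \theta) - \omega}{\sqrt{2\theta}}\right)\right].$$

In particular, in the limiting case $\theta \to \infty$, one has

$$P(\omega, h, \tau) = 1 - e^{-2h(h(1 + \tau) - \omega)}. \tag{B.12}$$

Substituting $Q(\omega, h; \tau)$ (B.6) and $P(\omega, h, \tau)$ (B.12) into (B.9), after
integration, we obtain

$$F(h, \tau) = 2h e^{-2h^2} \operatorname{erfc}\left(\frac{h(1 - \tau)}{\sqrt{2\tau}}\right). \tag{B.13}$$

Consider now the probability

$$\Phi_{\text{high}}(h, t)\, dh = \Pr\{H \in (h, h + dh), t_{\text{high}} < t\}.$$

Due to the identity in law (A.7), $\Phi_{\text{high}}(h, t)$ is equal to

$$\Phi_{\text{high}}(h, t) = F\!\left(h, \frac{t}{1 - t}\right) = 2h e^{-2h^2} \operatorname{erfc}\left(\frac{h(1 - 2t)}{\sqrt{2t(1 - t)}}\right). \tag{B.14}$$

The integration over $h \in (0, \infty)$ gives the cumulative distribution function
(cdf) of the random occurrence time $t_{\text{high}}$ (A.3):

$$\Phi_{\text{high}}(t) = \Pr\{t_{\text{high}} < t\} = \int_0^\infty \Phi_{\text{high}}(h, t)\, dh = t, \qquad t \in (0, 1).$$

This means that the occurrence time $t_{\text{high}}$ of the high value of the
canonical bridge is uniformly distributed. The above cdf satisfies the symmetry
property (A.4). The corresponding pdf $\varphi_{\text{high}}(t) = 1$ obviously satisfies the
symmetry property (A.5).

The sought joint pdf of the high value $H$ of the canonical bridge $Y(t)$ and of
its corresponding occurrence time $t_{\text{high}}$ is

$$\varphi_{\text{high}}(h, t) = \frac{\partial \Phi_{\text{high}}(h, t)}{\partial t}. \tag{B.15}$$

Substituting here $\Phi_{\text{high}}(h, t)$ (B.14) yields

$$\varphi_{\text{high}}(h, t) = \sqrt{\frac{2}{\pi}}\, \frac{h^2}{[t(1 - t)]^{3/2}} \exp\left(-\frac{h^2}{2t(1 - t)}\right). \tag{B.16}$$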
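
Two consequences of this appendix are easy to check by simulation: the occurrence time of the high of the canonical bridge is uniform on $(0, 1)$, and the high itself has the pdf $4h e^{-2h^2}$ of (B.8), whose mean is $\sqrt{\pi/8} \approx 0.627$. The sketch below uses a discretized bridge, so the simulated high is biased slightly downward; the grid size, sample size and function name are illustrative assumptions.

```python
import numpy as np

def bridge_high_and_time(n_paths=4000, n_steps=500, seed=4):
    """High value and its occurrence time for a discretized canonical bridge."""
    rng = np.random.default_rng(seed)
    dt = 1.0 / n_steps
    w = np.cumsum(np.sqrt(dt) * rng.standard_normal((n_paths, n_steps)), axis=1)
    t = np.arange(1, n_steps + 1) * dt
    y = w - t * w[:, -1:]                 # Y(t) = W(t) - t W(1)
    idx = y.argmax(axis=1)
    return y[np.arange(n_paths), idx], t[idx]

high, t_high = bridge_high_and_time()

# Empirical cdf of t_high at a few points; uniformity predicts Pr{t_high < q} = q.
cdf_points = {q: float(np.mean(t_high < q)) for q in (0.25, 0.5, 0.75)}
mean_high = float(high.mean())
print(cdf_points, mean_high, np.sqrt(np.pi / 8))
```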

C Statistics of the high, low and occurrence 
time of the last extremum of the canonical 
bridge 

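C.1 Statistical description of the joint pdf of the high,
low and occurrence time of the last extremum

The occurrence times of the first and last absolute extremes (A.3) of the
canonical bridge $Y(t)$ are formally defined as

$$t_{\text{first}} = \inf\{t_L, t_H\}, \qquad t_{\text{last}} = \sup\{t_L, t_H\}. \tag{C.1}$$

The joint pdf of the high $H$ and low $L$ (A.2) together with the cdf of the
occurrence time $t_{\text{last}}$ is given by

$$\Phi_{\text{last}}(h, \ell, t)\, dh\, d\ell = \Pr\{H \in (h, h + dh) \cap L \in (\ell, \ell + d\ell) \cap t_{\text{last}} < t\}. \tag{C.2}$$

We derive the function $\Phi_{\text{last}}(h, \ell, t)$ by using a natural generalization of the
reasoning presented in Appendix B that led to the joint pdf $\Phi_{\text{high}}(h, t)$ (B.14)
of the high value $H$ and of the cdf of the occurrence time $t_{\text{high}}$. Namely, we
calculate first the probability

$$F(h, \ell, \tau)\, dh\, d\ell = \Pr\{\mathcal{H} \in (h, h + dh), \mathcal{L} \in (\ell, \ell + d\ell), \tau_{\text{last}} < \tau\}, \tag{C.3}$$

where

$$\mathcal{H} = \sup_{\tau \in (0, \infty)} Z(\tau), \qquad \mathcal{L} = \inf_{\tau \in (0, \infty)} Z(\tau),$$
$$\tau_{\text{last}} = \sup\{\tau_{\text{low}}, \tau_{\text{high}}\}, \qquad \tau_{\text{low}} : \mathcal{L} = Z(\tau_{\text{low}}), \qquad \tau_{\text{high}} : \mathcal{H} = Z(\tau_{\text{high}}).$$

Analogously to (B.9), $F(h, \ell, \tau)$ is equal to

$$F(h, \ell, \tau) = \int_{\ell(1 + \tau)}^{h(1 + \tau)} Q(\omega, h, \ell, \tau)\, P(\omega, h, \ell, \tau)\, d\omega, \tag{C.4}$$

where

$$Q(\omega, h, \ell, \tau) = -\frac{\partial^2 f(\omega; h, \ell, \tau)}{\partial h\, \partial \ell} \tag{C.5}$$

and the pdf $f(\omega; h, \ell, \tau)$ satisfies the initial-boundary problem

$$\frac{\partial f}{\partial \tau} = \frac{1}{2} \frac{\partial^2 f}{\partial \omega^2}, \qquad f(\omega; h, \ell, \tau = 0) = \delta(\omega),$$
$$f(\omega = h(1 + \tau); h, \ell, \tau) = 0, \qquad f(\omega = \ell(1 + \tau); h, \ell, \tau) = 0, \qquad \tau > 0. \tag{C.6}$$

Similarly to $P(\omega, h, \tau)$ (B.10), the probability $P(\omega, h, \ell, \tau)$ is given by

$$P(\omega, h, \ell, \tau) = \lim_{\theta \to \infty} P(\omega, h, \ell, \tau, \theta),$$
$$P(\omega, h, \ell, \tau, \theta) = \Pr\{\ell(1 + \tau') < W(\tau' | \tau, \omega) < h(1 + \tau') : \tau' \in (\tau, \tau + \theta)\}.$$

Analogously to (B.11), the last probability $P(\omega, h, \ell, \tau, \theta)$ is equal to

$$P(\omega, h, \ell, \tau, \theta) = \int_{\ell(1 + \tau + \theta)}^{h(1 + \tau + \theta)} f(x; \omega, h, \ell, \tau, \theta)\, dx, \tag{C.7}$$

where $f(x; \omega, h, \ell, \tau, \theta)$ is the solution of the initial-boundary problem

$$\frac{\partial f}{\partial \theta} = \frac{1}{2} \frac{\partial^2 f}{\partial x^2}, \qquad f(x; \omega, h, \ell, \tau, \theta = 0) = \delta(x - \omega),$$
$$f(x = h(1 + \tau + \theta); \omega, h, \ell, \tau, \theta) = 0, \qquad f(x = \ell(1 + \tau + \theta); \omega, h, \ell, \tau, \theta) = 0. \tag{C.8}$$

Knowing the function $F(h, \ell, \tau)$ defined by equality (C.3), one can find
the sought function $\Phi_{\text{last}}(h, \ell, t)$ (C.2) using the following relation

$$\Phi_{\text{last}}(h, \ell, t) = F\!\left(h, \ell, \frac{t}{1 - t}\right), \tag{C.9}$$

which is analogous to (B.14). In turn, one can find the joint pdf of the high
$H$, low $L$ values (A.2) and occurrence time of the last absolute extremum $t_{\text{last}}$
(C.1) of the canonical bridge $Y(t)$ using, analogously to (B.15), the relation

$$\varphi_{\text{last}}(h, \ell, t) = \frac{\partial \Phi_{\text{last}}(h, \ell, t)}{\partial t}. \tag{C.10}$$

C.2 Solutions of boundary-value problems

Solving the initial-boundary problem (C.6) with the reflection method, we
obtain

$$f(\omega; h, \ell, \tau) = \sum_{m=-\infty}^{\infty} \left[e^{-2(h - \ell)^2 m^2}\, g(\omega + 2(h - \ell)m; \tau) - e^{-2((h - \ell)m + h)^2}\, g\big(\omega - 2(h + (h - \ell)m); \tau\big)\right], \tag{C.11}$$

where

$$g(\omega; \tau) = \frac{1}{\sqrt{2\pi\tau}} \exp\left(-\frac{\omega^2}{2\tau}\right).$$

In turn, the solution of the initial-boundary problem (C.8) is given by

$$f(x; \omega, h, \ell, \tau, \theta) = \sum_{m=-\infty}^{\infty} \Big[e^{-2(h - \ell)^2 m^2 (1 + \tau) + 2(h - \ell) m \omega}\, g\big(x - \omega + 2m(h - \ell)(1 + \tau); \theta\big) - e^{-2((h - \ell)m + h)^2 (1 + \tau) + 2\omega((h - \ell)m + h)}\, g\big(x + \omega - 2((h - \ell)m + h)(1 + \tau); \theta\big)\Big]. \tag{C.12}$$

After substituting $f(\omega; h, \ell, \tau)$ (C.11) into (C.5), we obtain

$$Q(\omega, h, \ell, \tau) = \frac{4}{\tau^2} \sum_{m=-\infty}^{\infty} m \Big\{ m\, e^{-2(h - \ell)^2 m^2} \big[(\omega + 2m(h - \ell)(1 + \tau))^2 - \tau(1 + \tau)\big]\, g(\omega + 2m(h - \ell); \tau) - (1 + m)\, e^{-2(m(h - \ell) + h)^2} \big[(\omega - 2(m(h - \ell) + h)(1 + \tau))^2 - \tau(1 + \tau)\big]\, g\big(\omega - 2(m(h - \ell) + h); \tau\big) \Big\}. \tag{C.13}$$

Substituting $f(x; \omega, h, \ell, \tau, \theta)$ (C.12) into (C.7), and taking the limit $\theta \to \infty$,
we obtain

$$P(\omega, h, \ell, \tau) = \sum_{m=-\infty}^{\infty} \left[e^{-2(h - \ell)^2 (1 + \tau) m^2 + 2(h - \ell) m \omega} - e^{-2(h + (h - \ell)m)^2 (1 + \tau) + 2(h + (h - \ell)m)\omega}\right]. \tag{C.14}$$

After substituting $Q(\omega, h, \ell, \tau)$ (C.13) and $P(\omega, h, \ell, \tau)$ (C.14) into (C.4),
we obtain the explicit expression for $F(h, \ell, \tau)$. Substituting it into (C.9)
and using relation (C.10), we obtain the pdf of the high $H$, low $L$ values and
occurrence time of the last extremum under the form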



oo oo 



m 



<P)BBt(h,£,t)= ^2 
m=— oo n=~c 

g(h, t, 2(h - £)m, 2(h - £)n) - g(£, t, 2(h - £)m, 2(h - £)n)- 
g(h, t, 2(h - £)m, 2(h + (h - t)n)) + g{£, t, 2(h - £)m, 2(h + (h - £)n)) 

m(m + 1) g(h, t, -2(h + (h - £)m), 2(h - £)n)- 
g(t, t, -2{h +(h- e)m), 2(h - £)n)- 
g(h, t, -2(h + {h- £)m), 2(h + (h- £)n))+ 

g(£, t, -2(h + (h- £)m), 2(h + (h — £)n)) 

(a + yf - (a + c)(a-c + 2y)t 



g(y,t,a,c) = -,l n{i _ t)H7 e Wy ^ _ f) 

[(a + y) 3 - (a + y)(3 + (a + y)(a - c + 2y))t + (3a - c + Ay)t 2 ] . 

(C.15)

C.3 Function $\alpha_{\text{last}}(\theta, t; \lambda)$

Some of the most efficient estimators introduced in this paper are defined
through the function

$$\alpha_{\text{last}}(\theta, t; \lambda) = \int_0^\infty r^{\lambda+1}\, \varphi_{\text{last}}(r\cos\theta, r\sin\theta, t)\, dr, \tag{C.16}$$

which is analogous to (75). The calculation of the integral (C.16) yields



3 + A 



E 



m,n=— oo 



^8vr(l - t) W+ A V 2 
/3(co, t, 2sc ■ m, 2sc ■ n; A) — /3(sz, £, 2sc • m, 2sc • n; A) 
/3(co, t, 2sc ■ m, 2(co + sc ■ n); A) + 

/3(sz, t, 2sc • m, 2(co + sc • n); A) — 

m(m + 1) /9(co, t, — 2(co + sc • m), 2sc • n; A) — 
/3(si, t, — 2(si + sc • m), 2sc • n; A) — 
/9(co, £, — 2(co + sc ■ m), 2(co + sc ■ n); A) + 

/3(si, t, — 2(co + sc ■ m), 2(co + sc • n); A) 



m 2 x 



where

$\text{co} = \cos\theta$, $\text{si} = \sin\theta$, $\text{sc} = \text{co} - \text{si}$,
P(y, t, a, c; A) = (a + y) 2 [a + y - (a - c + 2y)t](3 + A) + 
5t[(6a-2c + 8?/)t-6(a + ?/)] , 

(a + y) 2 - (a + c)(a - c + 2y)t 



2t(l - 1) 



(C.17) 



5 = 5(y,t,a,c) 

C.4 Function $\alpha(\theta, \upsilon, t; \lambda)$

Consider the function

$$\alpha(\theta, \upsilon, t; \lambda) = \int_0^\infty r^{\lambda+2}\, \varphi_{\text{last}}(r\cos\upsilon\cos\theta, r\cos\upsilon\sin\theta, r\sin\upsilon, t; \gamma = 0)\, dr, \tag{C.18}$$

that enters into the definition of the canonical estimator (109) in the case
of zero drift $\gamma = 0$. Using expression (112) for the pdf $\varphi_{\text{last}}(h, \ell, x, t; \gamma)$, we
obtain after calculations the following expression

r(¥) 



a(9, v, t; A) 



m 2 x 



co, t, sc ■ m,sc ■ n; A) — f3'(x, si, t, sc ■ m,sc ■ n; A) 
(3'(x, co, t, sc ■ m,cc + sc ■ n; A) + (3'(x, si, t, sc ■ m,cc + sc ■ n 

m(m + 1) j3'(x, co, t, —cc — sc ■ m, sc ■ n; A) — 
f3'(x, si, t, —cc — sc ■ m, sc ■ n; A) — 

(3'(x, co, t, —cc — sc ■ m,cc + sc ■ n; A)+ 

(3'(x, si, t, —cc — sc ■ m,cc + sc ■ n; A) 



which is analogous to (C.17). Here, we have set

$x = \sin\upsilon$, $\text{co} = \cos\theta\cos\upsilon$, $\text{si} = \sin\theta\cos\upsilon$,
$\text{cc} = 2\cos\theta\cos\upsilon$, $\text{sc} = 2(\cos\theta - \sin\theta)\cos\upsilon$,

P'(x, y, t, a, c, A) = [r(4 + A) (a + y - (a - c + 2y)t) + 
5t{{6a - 2c + 8y)t - 6(a + y))] S~ (6+A)/2 , 



r — (a + c)(a — c + 2y)t x 2 
= 2t(l - t) + T'

Fig. 1: 5000 realizations of the open-close estimator (114), of the
G&K estimator (115), and of the most efficient estimator $\hat d_{\text{t-me-x}}$ (109).

Fig. 2: Moving averages of the open-close (top panel), G&K (middle
panel) and time-OHLC (109) (lower panel) estimators over respective
window sizes of 30 samples for the top panel and 10 samples for
the two other panels. As explained in the text, this moving average
mimics the normalized estimators of the integrated variance in the
case where all instantaneous variances are the same ($\sigma_i^2 = \sigma^2 = \text{const}$).

Fig. 3: Top to bottom, $\gamma$-dependence of the expected values of the
open-close $\hat d_{\text{real}}$ (114), G&K $\hat d_{\text{GK}}$ (115) and most efficient $\hat d_{\text{t-me-x}}$ (109)
canonical estimators. The horizontal line is the expected value of the
canonical estimators $\hat d_{\text{bPark}}$ and $\hat d_{\text{t-me}}$.

Fig. 4: $\gamma$-dependence of the statistical average of the variances of
the canonical estimators $\hat d_{\text{GK}}$ (upper open circles) and $\hat d_{\text{t-me-x}}$ (lower
open circles). The two horizontal lines are the variances (117) of the
canonical estimators $\hat d_{\text{bPark}}$ (top) and $\hat d_{\text{t-me}}$ (bottom), respectively.